Abstract

We study stochastically perturbed non-holonomic systems from a geometric point 
of view. In this setting, it turns out that the probabilistic properties of the perturbed 
system are intimately linked to the geometry of the constraint distribution. For G- 
Chaplygin systems, this yields a stochastic criterion for the existence of a smooth 
preserved measure. As an application of our results we consider the motion planning 
problem for the noisy two-wheeled robot and the noisy snakeboard. 
1 Introduction

The goal of this paper is the study of stochastic non-holonomic systems. This is a natural 
continuation of the work on stochastic Hamiltonian systems pioneered by Bismut [5] and 
revitalized, brought up to date, and expanded by Lázaro-Camí and Ortega [32], who also
connected it to symmetries, momentum maps, and reduction. 

1.A Motivation and basic idea

A non-holonomic system is, essentially, a rigid body together with a set of constraints on
the velocities. A prototypical example is the Chaplygin ball ([10]; for a modern treatment
see [13] and [12, Chapter 6]). Here, the configuration space is the direct product Lie
group $G = SO(3) \times \mathbb{R}^2$, describing orientation and position of the ball, and the kinetic
energy is specified by a left-invariant metric. There are two (non-integrable) velocity
constraints which ensure that the ball does not slip, i.e., the point of contact of the ball and the plane
has zero velocity. Without constraints (which is clearly not the case in the problem just
presented), this would describe the motion of a rigid body in the plane, hence it would be
a Hamiltonian system.
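
To make the no-slip condition concrete, the following sketch writes it out in coordinates. The notation is an assumption made for illustration only and is not fixed by the text: a ball of radius $r$ with centre at $(x, y, r)$ and spatial angular velocity $\omega = (\omega_1, \omega_2, \omega_3)$.

```latex
% Illustrative sketch; r, (x, y) and \omega are assumed notation.
% Velocity of the material point of the ball touching the plane:
\[
  v_{\mathrm{contact}}
  = (\dot x, \dot y, 0) + \omega \times (0, 0, -r)
  = (\dot x - r\,\omega_2,\; \dot y + r\,\omega_1,\; 0).
\]
% No slip means v_contact = 0, which gives the two velocity constraints
\[
  \dot x = r\,\omega_2, \qquad \dot y = -r\,\omega_1 .
\]
```

These are two linear conditions on the velocities, so the constraint distribution has corank two in $T(SO(3) \times \mathbb{R}^2)$.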
Stochastically perturbed versions of the latter setting (i.e., without constraints) have
been considered by Lázaro-Camí and Ortega [33, Section 7.3]: Let $H_0$ be the kinetic energy
Hamiltonian of a left invariant metric $\mu$ on the Lie group $G$, $\{u_i\}_i$ the left invariant
frame on $G$ extending an orthonormal basis of the Lie algebra $\mathfrak{g}$ of $G$, and
$h^i : T^*G \to \mathbb{R}$, $(q,p) \mapsto \langle p, u_i(q) \rangle$. Note that $h^i$ is the component $\langle J, u_i \rangle$ of the momentum
map $J : T^*G \to \mathfrak{g}^*$ defined by the lift to $T^*G$ of right translation of $G$ on itself.
The $\mathbb{R} \times \mathfrak{g}^*$-valued function $H = (H_0, h^i)$ on $T^*G$ is left invariant. Following [32, 33] and
assuming that the perturbation is given by white noise, the stochastic rigid body is thus
modeled by the Stratonovich equation

$$\delta\Gamma = X_{H_0}(\Gamma)\,\delta t + \sum_i X_{h^i}(\Gamma)\,\delta W^i \tag{1.1}$$

where $X_h$ denotes the Hamiltonian vector field of the function $h : T^*G \to \mathbb{R}$ and $W = (W^i)$
is Brownian motion in $\mathfrak{g} = \mathbb{R}^n$. A physical system modeled by this equation is
that of a rigid body subject to small random impacts. Note that, since $u_i$ is auto-parallel
for the Levi-Civita connection, the equation $\delta\Gamma = \sum_i X_{h^i}(\Gamma)\,\delta W^i$ yields the Hamiltonian
construction of Brownian motion, as in [32].
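
For orientation, the generator of a Stratonovich equation of this type can be written out explicitly. The identity below is the standard one, given as a sketch; the vector fields $X_0, X_1, \dots, X_k$ and the test function $f$ are generic placeholders introduced here for illustration.

```latex
% Generic Stratonovich equation driven by vector fields X_0, X_1, ..., X_k
% (placeholder names, not notation from the paper):
\[
  \delta\Gamma = X_0(\Gamma)\,\delta t + \sum_{i=1}^{k} X_i(\Gamma)\,\delta W^i .
\]
% Its generator acts on smooth test functions f by
\[
  A f = X_0 f + \tfrac{1}{2} \sum_{i=1}^{k} X_i (X_i f).
\]
% The second-order part is built from the individual fields X_i,
% not only from the subspace they span.
```

Because the second-order part involves each driving field separately, modifying the fields one by one (for instance by a projection) can produce a generator that depends on the chosen frame.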

To pass to the non-holonomic setting, we note that the equations of motion of the constrained
(Chaplygin) ball can be encoded in the vector field $PX_{H_0}$ where $P$ is the constraint
force projection defined in (2.12) below. The effect of $P$ is to force the
dynamics generated by $X_{H_0}$ to satisfy the constraints. Thus, the idea of 'the Hamiltonian
construction of stochastic non-holonomic systems' is to apply $P$ to (1.1). In fact, since
$PX_{H_0}$ is nothing but the non-holonomic vector field (see Section 2), we will focus on
studying the effects of $P$ on the second term in equation (1.1). This yields non-holonomic
constraints on the operator which is used to construct Brownian motion, thus leading to
'constrained Brownian motion' described by

$$\delta\Gamma = \sum_i PX_{h^i}(\Gamma)\,\delta W^i. \tag{1.2}$$

As it stands, this equation has some problems. It depends very much on the basis $\{u_i\}_i$
that was chosen in the definition of the $h^i$. For example, since the no-slip constraints are
actually right invariant, one could have chosen a right invariant frame. But then the Hamiltonian
description of Brownian motion needs a correction term involving the Levi-Civita
connection of $\mu$. This approach has been taken in [23]. However, the basis dependence
implies that the generator of (1.2) also changes when we pass to a different frame, and
there would be many natural choices depending on whether the frame should be left or
right invariant, adapted to the constraint distribution, or the direct product structure of $G$,
etc. Even if one ignores these issues, it is not clear what to do if the configuration space
is not parallelizable. For all these reasons we transfer the construction to the bundle of
orthonormal frames itself. It is only then that the generator of the resulting 'constrained
Brownian motion' is basis independent. This constrained Brownian motion has some
interesting features:

• To visualize it, we can think of a microscopic robot (or ball, snakeboard, etc.) sub- 
ject to molecular bombardment. The robot thus experiences small impacts from all 
sides (isotropic in space) which force it to move around, but it still has to respect 
the constraints. 

• It turns out that the geometry of the constraints determines the probabilistic
properties of the perturbed system. Indeed, if the constraints are integrable, then
the robot's net drift will vanish. However, when the constraints are non-integrable
and non-mechanical (which is the generic case) the Gaussian noise will induce a
net drift on the robot. In Section 4 we quantify this drift in terms of the geometry
of the constraint distribution. Mechanical constraints are given, by definition, as
level sets of conserved quantities, such as momentum maps. E.g., the constraints
could be given by the horizontal bundle of the mechanical connection, which is just
orthogonal to the vertical bundle in the case of a symmetry group action.

• This leads to a dictionary between probabilistic aspects of the perturbed system 
and classical properties of the original (deterministic) non-holonomic system. See 
Theorem 1.2 below for a preliminary statement of this dictionary and Section 4 for 
further details. 

1.B Description of contents and results

Since this paper addresses both the geometric mechanics and the stochastic differential 
equations communities, we shall give the necessary background for all concepts and quote 
the main results that are used later on. The paper is self contained. We briefly present the 
main results and the structure of the paper. 

Non-holonomic systems 

We start by recalling the necessary facts, concepts, and results of non-holonomic systems 
and their geometry. This includes a careful presentation of symmetries, reduction, and 
conditions for the existence of a (smooth) preserved measure. We will have to rephrase 
some of the existing results in view of applying them to our stochastic study later on and 
develop the theory in the direction needed in subsequent sections in the paper. 

Thus, owing to our reformulation, we have to give complete proofs for some of the known
results, and we also need to establish new formulas. For example, the
global formula (2.8) of the symplectic form on the tangent bundle given in terms of an
underlying Riemannian metric on configuration space is new, as far as we know. In (2.12)
we introduce the above mentioned constraint force projection and explain its properties
to prepare for Section 4. We also study Chaplygin systems, which are non-holonomic
systems with a particularly rich geometric structure, and the symmetry reduction of such
systems. One of the main points of Section 2 is the presentation of a certain one-form $\beta$
which, according to Proposition 2.5, characterizes the existence of a (smooth) preserved
measure for a given Chaplygin system. This result has been previously derived in [8] but
both our proof and our interpretation of the relevant one-form $\beta$ are different. In fact, our
formulation of $\beta$ in (2.25) is a prerequisite for Section 4.

Stochastic dynamics on manifolds 

First, we recall some notions about manifold valued stochastic differential equations and 
diffusions from [26, 18]. 

Then we study symmetries of Stratonovich equations. We consider a manifold $Q$ together
with a proper action by a Lie group $G$ and a diffusion generated by a Stratonovich
operator $\mathcal{S}$ from $T\mathbb{R}^{k+1}$ to $TQ$ satisfying the equivariance relation (3.11). In this setting,
the Stratonovich operator does not (in general) induce a Stratonovich operator on the base
$Q/G$; however, the diffusion and its generator $A^Q$ are projectable to $Q/G$. Thus, there
is an induced diffusion $\Gamma^{Q/G}$ with induced generator $A^{Q/G}$ on the base space $Q/G$. See
Theorem 3.2.

Two examples for this procedure of 'equivariant reduction' are the Eells-Elworthy- 
Malliavin construction of Brownian motion (cf. equation (3.10)) on a Riemannian mani- 
fold and the stochastic Calogero-Moser systems (see [24]), as remarked in Subsection 3.B. 
In particular, we allow for non-free G-actions on Q and hence Q/G is, in general, not a 
smooth manifold but a stratified space. Thus, we extend the reduction theorem of [33, 
Theorem 3.1] to the case when the Stratonovich operator on the total space is not invari- 
ant but equivariant with respect to a symmetry group action. 

This naturally leads to the introduction, in Subsection 3.C, of certain notions of equiv- 
ariant diffusions, previously studied in [16, 17]. The material of this subsection will also 
be useful in Section 5. In particular, we prove a mean reconstruction equation for diffu- 
sions in principal bundles which is analogous to a concept by the same name in mechanics 
(see, e.g., [1, §4.3], [35, §3], [36, Theorem 11.8]) and uses that of [16, 17]. 

Non-holonomic diffusions 

This section contains the main results of the paper. We introduce constrained Brownian 
motion as motivated above. This involves a careful analysis of the underlying geometry. 
Then we study the generator and symmetry reduction of the resulting diffusion process. 
The reduction relies on Theorem 3.2. 
The surprising fact in this regard is that there is a very strong interrelation between some
probabilistic aspects of constrained Brownian motion and certain deterministic properties
of the original non-holonomic system. A first instance of this relation is:

Theorem 1.1. Constrained Brownian motion is a martingale with respect to the non- 
holonomic connection on the configuration space. 

A second result yields a probabilistic characterization of the existence of a preserved 
measure which is a very important concept in the theory of non-holonomic systems (see 
[3, 6, 10, 15, 25, 22, 30]): 

Theorem 1.2. Let $(Q, \mathcal{D}, \mathcal{L})$ be a G-Chaplygin system such that the base $M := Q/G$ is
compact. Let $\Gamma^M$ be the non-holonomic diffusion in $M$ associated to these data. Then the
following are equivalent:

(1) $(Q, \mathcal{D}, \mathcal{L})$ has a (smooth) preserved measure;

(2) $\Gamma^M$ is time-reversible;

(3) $\Gamma^M$ has vanishing entropy production rate.

The compactness assumption on M is met in all classical examples such as the Chaply- 
gin ball or the two-wheeled robot. This theorem sums up some of the results of Sections 4 
and 3.D, where also the relevant notions are introduced. 

Examples 

As examples, we consider the two-wheeled robot and the snakeboard. The former is G- 
Chaplygin and does (in general) not allow for a preserved measure. The latter is not a 
Chaplygin system but does fit the general set-up of Section 4. For both of these examples 
we consider also the stochastic perturbation of deterministic trajectory planning. This em- 
phasizes the way in which the noise couples with the constraints to produce a non-trivial 
drift vector field (the emergence of which is at the heart of the geometry of Section 4); 
this is in sharp contrast to stochastic Hamiltonian systems. Indeed, the Hamiltonian ana- 
logue of non-holonomic reduction is reduction at the 0-level set of the standard cotangent 
bundle momentum map, which reduces Brownian motion to Brownian motion in the base 
with respect to the induced metric. This is a manifestation of the idea that the amount by 
which a non-holonomic system differs from a Hamiltonian one can be measured by the 
amount by which the induced diffusion differs from Brownian motion - and vice versa. 

However, in the non-holonomic setting, the constraints induce a drift giving rise to
drifted Brownian motion on the base space. This drift is quantified in Section 4 and we
use it to make the perturbed motion follow a given curve on average. We show how the
explicit form of the drift allows, in principle, for a simple numerical implementation to
solve such a motion planning problem. It should be noted, though, that we have made no
attempt to study stability or convergence properties of the resulting numerical algorithm.
Similar problems have been treated, from a different perspective, in the engineering literature;
see [2, 42] and the references therein.

Geometry of non-holonomic diffusion

2 Non-holonomic systems 

We recall some facts about non-holonomic and, specifically, G-Chaplygin systems. Then 
we give a necessary and sufficient condition for the existence of a preserved measure that 
is suitable for our applications in Section 4. 

A non-holonomic system is a triple $(Q, \mathcal{D}, \mathcal{L})$ consisting of an $n$-dimensional configuration
manifold $Q$, a constraint distribution $\mathcal{D} \subset TQ$ which is smooth and of constant rank
$r < n$ (i.e., it is a vector subbundle of $TQ$ of rank $r$), and a smooth Lagrangian function
$\mathcal{L} : TQ \to \mathbb{R}$. The dynamics of $(Q, \mathcal{D}, \mathcal{L})$ are given by the Lagrange-d'Alembert principle;
see [3, 4, 6, 9, 10, 25, 30]. Throughout this paper, we assume that $\mathcal{L}$ is the kinetic
energy of a Riemannian metric $\mu$ on $Q$.
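
As a minimal illustration of a constant-rank, non-integrable constraint distribution, one may keep the knife edge in mind. This example and its coordinates $(x, y, \theta)$ are assumptions chosen for illustration; they are not taken from the text.

```latex
% Illustrative example (not from the paper): the knife edge on
% Q = R^2 x S^1 with coordinates (x, y, \theta), so n = 3. The constraint
% \dot x \sin\theta - \dot y \cos\theta = 0 defines the rank r = 2 distribution
\[
  \mathcal{D} = \operatorname{span}\{X_1, X_2\}, \qquad
  X_1 = \cos\theta\,\partial_x + \sin\theta\,\partial_y, \qquad
  X_2 = \partial_\theta .
\]
% The bracket of the two spanning fields leaves the distribution:
\[
  [X_1, X_2] = \sin\theta\,\partial_x - \cos\theta\,\partial_y \notin \mathcal{D},
\]
% so by the Frobenius theorem \mathcal{D} is not integrable.
```

With the flat metric $dx^2 + dy^2 + d\theta^2$ the bracket $[X_1, X_2]$ is even orthogonal to $\mathcal{D}$, which makes the failure of integrability visible at every point.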

2.A Almost Hamiltonian formulation 

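Since $TQ \ni u_q \mapsto \mu(q)(u_q, \cdot) \in T^*Q$ is a vector bundle isomorphism covering the
identity on $Q$, we shall identify the vector bundles $TQ$ with $T^*Q$. We follow [4] to give
an almost Hamiltonian description of the dynamics of $(Q, \mathcal{D}, \mathcal{L})$. Let $\tau_Q : TQ \to Q$ be
the tangent bundle projection and $\iota : \mathcal{D} \hookrightarrow TQ$ the inclusion. Define

$$\mathcal{C} := \{X_{u_q} \in T\mathcal{D} \mid u_q \in \mathcal{D}_q,\ T_{u_q}(\tau_Q \circ \iota)(X_{u_q}) \in \mathcal{D}_q\} = (T(\tau_Q \circ \iota))^{-1}(\mathcal{D}). \tag{2.1}$$

In standard vector bundle charts of $TQ$ and $TTQ$, we write $u_q$ as $(q, \dot q)$ and $X_{u_q}$ as
$(q, \dot q, \delta q, \delta\dot q)$, respectively. Since $(\tau_Q \circ \iota)(q, \dot q) = q$, it follows that
$T(\tau_Q \circ \iota)(q, \dot q, \delta q, \delta\dot q) = (q, \delta q)$ and hence
$\mathcal{C} = \{(q, \dot q, \delta q, \delta\dot q) \mid (q, \dot q), (q, \delta q) \in \mathcal{D}\}$,
$\ker(T(\tau_Q \circ \iota)(q, \dot q, \cdot, \cdot)) = \{(q, \dot q, 0, \delta\dot q) \mid \delta\dot q \in \mathbb{R}^r\}$.
Thus $\mathcal{C}$ is a vector subbundle of $T\mathcal{D}$ of rank $2r$. (If $\mathcal{D}$ is the horizontal
subbundle of a principal connection of some proper and free $G$-action on $Q$ then $\mathcal{C}$
is the horizontal space of the tangent lifted $G$-action on $\mathcal{D}$. See (2.14) below.) According
to [4, Section 5] we have

$$(T(TQ))|_{\mathcal{D}} = \mathcal{C} \oplus \mathcal{C}^{\Omega} \tag{2.2}$$

where $\mathcal{C}^{\Omega} := \{X_{u_q} \in T_{u_q}(TQ) \mid u_q \in \mathcal{D},\ \Omega(u_q)(X_{u_q}, Y_{u_q}) = 0\ \ \forall\, Y_{u_q} \in \mathcal{C}\}$ is the $\Omega$-orthogonal
complement of $\mathcal{C}$ in $(T(TQ))|_{\mathcal{D}}$; $\Omega$ denotes the canonical symplectic form on
$TQ \cong T^*Q$. We will prove identity (2.2) later on, after the proof of Proposition 2.1.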

For reasons that will become clear in Section 4, we elaborate on (2.2). We use the
Levi-Civita connection $\nabla^\mu$ on $TQ \to Q$ to decompose $TTQ = \mathrm{Hor}^\mu \oplus \mathrm{Ver}(\tau_Q)$, where
$\mathrm{Ver}(\tau_Q) = \ker(T\tau_Q : TTQ \to TQ)$ is the vertical and $\mathrm{Hor}^\mu \subset TTQ$ is the horizontal
subbundle. Recall that a curve $v(t)$ in $TQ$ is horizontal if its covariant derivative
$\frac{Dv(t)}{dt} := \frac{d}{ds}\big|_{s=0} \mathbb{P}_t^{t+s} v(t+s)$ vanishes; here $\mathbb{P}_t^{t+s} : T_{q(t+s)}Q \to T_{q(t)}Q$ is the parallel transport
operator of the Levi-Civita connection and $q(t) := \tau_Q(v(t))$. Alternatively, since
$\frac{Dv(t)}{dt} = \nabla^\mu_{\dot q(t)} v(t)$, or in coordinates, $\frac{Dv^i}{dt} = \dot v^i + \Gamma^i_{jk}(q(t))\,\dot q^j(t)\,v^k(t)$, the curve
$v(t)$ is horizontal if and only if in any standard tangent bundle chart

$$\dot v^i(t) + \Gamma^i_{jk}(q(t))\,\dot q^j(t)\,v^k(t) = 0. \tag{2.3}$$

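A vector $X_{u_q} \in T_{u_q}TQ$ is called horizontal if it is tangent to a horizontal curve. The
horizontal space $\mathrm{Hor}^\mu_{u_q} \subset T_{u_q}TQ$ is the vector subspace formed by all horizontal vectors.

If $u_q = \dot q^i \frac{\partial}{\partial q^i} \in T_qQ$, the decomposition of a vector $X_{u_q} = A^i \frac{\partial}{\partial q^i} + B^i \frac{\partial}{\partial \dot q^i} \in T_{u_q}TQ$
in its horizontal and vertical part is

$$A^i \frac{\partial}{\partial q^i} + B^i \frac{\partial}{\partial \dot q^i} = \left( A^i \frac{\partial}{\partial q^i} - \Gamma^i_{jk}\,\dot q^j A^k \frac{\partial}{\partial \dot q^i} \right) + \left( \Gamma^i_{jk}\,\dot q^j A^k + B^i \right) \frac{\partial}{\partial \dot q^i}. \tag{2.4}$$

Indeed, since $T_{u_q}\tau_Q \left( R^i \frac{\partial}{\partial q^i} + S^i \frac{\partial}{\partial \dot q^i} \right) = R^i \frac{\partial}{\partial q^i}$ it follows that

$$T_{u_q}\tau_Q \left( \left( \Gamma^i_{jk}\,\dot q^j A^k + B^i \right) \frac{\partial}{\partial \dot q^i} \right) = 0,$$

which shows that the second summand in (2.4) is vertical. The first summand is horizontal
since it verifies the horizontality condition (2.3) (with $v^i = \dot q^i$, $\dot q^i = A^i$, and $\dot v^i = -\Gamma^i_{jk}\,\dot q^j A^k$).
In particular, note that $T_{u_q}\tau_Q : \mathcal{C}_{u_q} \cap \mathrm{Hor}^\mu_{u_q} \to \mathcal{D}_q$ is an isomorphism:
$A^i \frac{\partial}{\partial q^i} - \Gamma^i_{jk}\,\dot q^j A^k \frac{\partial}{\partial \dot q^i} \in \mathrm{Hor}^\mu_{u_q}$ maps to the given vector $A^i \frac{\partial}{\partial q^i} \in \mathcal{D}_q$. Similarly
$T_{u_q}\tau_Q : \mathrm{Hor}^\mu_{u_q} \to T_qQ$ is an isomorphism. Its inverse is the horizontal lift mapping which is often written
as a map $\mathrm{hl}^\mu : TQ \times_Q TQ \to \mathrm{Hor}^\mu$, $(u_q, v_q) \mapsto (T_{u_q}\tau_Q|_{\mathrm{Hor}^\mu_{u_q}})^{-1}(v_q)$. Interpreting
$\mathrm{pr}_1 : TQ \times_Q TQ \to TQ$ as a vector bundle over $TQ$ with base the first factor makes
$\mathrm{hl}^\mu : TQ \times_Q TQ \to \mathrm{Hor}^\mu$ into a vector bundle isomorphism covering the identity on $TQ$.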
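
The horizontal and vertical splitting of a tangent vector to $TQ$ can be made concrete in a simple curved chart. The following worked example uses the Euclidean plane in polar coordinates; the choice of example and the component names $A^r, A^\theta, B^r, B^\theta$ are assumptions made for illustration.

```latex
% Illustration (not from the paper): Q = R^2 \ {0} with \mu = dr^2 + r^2 d\theta^2.
% Non-zero Christoffel symbols:
%   \Gamma^r_{\theta\theta} = -r,
%   \Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r.
% Take u_q = (r, \theta, \dot r, \dot\theta) and
%   X = A^r \partial_r + A^\theta \partial_\theta
%     + B^r \partial_{\dot r} + B^\theta \partial_{\dot\theta}.
\[
  X^{\mathrm{hor}}
  = A^r \partial_r + A^\theta \partial_\theta
  + r\,\dot\theta A^\theta\, \partial_{\dot r}
  - \tfrac{1}{r}\bigl(\dot r A^\theta + \dot\theta A^r\bigr)\, \partial_{\dot\theta},
\]
\[
  X^{\mathrm{ver}}
  = \bigl(B^r - r\,\dot\theta A^\theta\bigr)\, \partial_{\dot r}
  + \bigl(B^\theta + \tfrac{1}{r}(\dot r A^\theta + \dot\theta A^r)\bigr)\,
    \partial_{\dot\theta}.
\]
% The two parts add up to X, and X^{ver} has no \partial_r or
% \partial_\theta component.
```

In Cartesian coordinates all Christoffel symbols vanish and the splitting reduces to separating the $\partial_{q^i}$ and $\partial_{\dot q^i}$ components.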

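Let $K : \mathrm{Ver}(\tau_Q) \to TQ \times_Q TQ$ be the inverse to the vertical lift mapping
$\mathrm{vl} : TQ \times_Q TQ \to \mathrm{Ver}(\tau_Q)$ defined by $\mathrm{vl}(u_q, v_q) := \frac{d}{dt}\big|_{t=0} (u_q + t v_q)$, for all $u_q, v_q \in T_qQ$.
In standard coordinates, $K(q, \dot q, 0, \delta\dot q) = (q, \dot q, q, \delta\dot q)$. In particular, $K(X_{u_q}) \in T_qQ$. In
addition, $T\tau_Q : \mathrm{Hor}^\mu \to TQ$ and $K : \mathrm{Ver}(\tau_Q) \to TQ$ restricted to each fiber over $TQ$
are linear isomorphisms. Let $P_{\mathrm{hor}}$ and $P_{\mathrm{ver}}$ denote the horizontal and vertical projections
associated to $\mathrm{Hor}^\mu$. By abuse of notation, we sometimes write $K$ also for
$K \circ P_{\mathrm{ver}} : TTQ \to \mathrm{Ver}(\tau_Q) \to TQ$. We have thus the vector bundle isomorphism over $\mathcal{D}$

$$\mathcal{C} \cong (\mathcal{D} \times_Q \mathcal{D}) \oplus \ker T(\tau_Q \circ \iota), \tag{2.5}$$
$$X_{u_q} \mapsto \bigl(u_q,\ T_{u_q}\tau_Q(X_{u_q}),\ P_{\mathrm{ver}}(X_{u_q})\bigr), \tag{2.6}$$
$$\mathrm{hl}^\mu_{u_q}(v_q) + \mathrm{vl}_{u_q}(w_q) \mapsfrom (u_q, v_q, w_q), \tag{2.7}$$

where we regard $\mathcal{D} \times_Q \mathcal{D} \ni (u_q, v_q) \mapsto u_q \in \mathcal{D}$ as a vector bundle over $\mathcal{D}$. Notice also
that $T\mathcal{D} \supset \ker T(\tau_Q \circ \iota) = \bigcup_{u_q \in \mathcal{D}} \mathrm{vl}_{u_q}(\mathcal{D}_q)$.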

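Proposition 2.1. The canonical symplectic form $\Omega \in \Omega^2(TQ)$ has the expression

$$\Omega(u_q)(X_{u_q}, Y_{u_q}) = \mu(q)\bigl(T_{u_q}\tau_Q(X_{u_q}), K(Y_{u_q})\bigr) - \mu(q)\bigl(T_{u_q}\tau_Q(Y_{u_q}), K(X_{u_q})\bigr), \tag{2.8}$$

for any $q \in Q$, $u_q \in T_qQ$, $X_{u_q}, Y_{u_q} \in T_{u_q}(TQ)$.

Proof. In an arbitrary standard tangent bundle chart, we have

$$\Omega = \frac{\partial \mu_{il}}{\partial q^k}\,\dot q^l\, dq^i \wedge dq^k + \mu_{ij}\, dq^i \wedge d\dot q^j \tag{2.9}$$

where the Riemannian metric is written as $\mu = \mu_{ij}\, dq^i \otimes dq^j$, with $\mu_{ij} = \mu_{ji}$. Thus, if

$$X_{u_q} = A^i \frac{\partial}{\partial q^i} + B^i \frac{\partial}{\partial \dot q^i}, \qquad Y_{u_q} = C^i \frac{\partial}{\partial q^i} + D^i \frac{\partial}{\partial \dot q^i},$$

we get

$$\Omega(u_q)(X_{u_q}, Y_{u_q}) = \frac{\partial \mu_{il}}{\partial q^k}\,\dot q^l (A^i C^k - A^k C^i) + \mu_{ij}(A^i D^j - C^i B^j). \tag{2.10}$$

On the other hand, $T_{u_q}\tau_Q(X_{u_q}) = A^i \frac{\partial}{\partial q^i}$, $T_{u_q}\tau_Q(Y_{u_q}) = C^i \frac{\partial}{\partial q^i}$ and

$$K(X_{u_q}) = \bigl(\Gamma^i_{jk}\,\dot q^j A^k + B^i\bigr) \frac{\partial}{\partial q^i}, \qquad K(Y_{u_q}) = \bigl(\Gamma^i_{jk}\,\dot q^j C^k + D^i\bigr) \frac{\partial}{\partial q^i}$$

by (2.4) and the definition of $K$. Therefore,

$$\begin{aligned}
&\mu(q)\bigl(T_{u_q}\tau_Q(X_{u_q}), K(Y_{u_q})\bigr) - \mu(q)\bigl(T_{u_q}\tau_Q(Y_{u_q}), K(X_{u_q})\bigr) \\
&\quad = \mu_{ij}\Gamma^j_{kl}\,\dot q^l (A^i C^k - A^k C^i) + \mu_{ij}(A^i D^j - C^i B^j) \\
&\quad = \frac{1}{2}\left( \frac{\partial \mu_{il}}{\partial q^k} + \frac{\partial \mu_{ik}}{\partial q^l} - \frac{\partial \mu_{kl}}{\partial q^i} \right) \dot q^l (A^i C^k - A^k C^i) + \mu_{ij}(A^i D^j - C^i B^j) \\
&\quad = -\mu_{lj}\Gamma^j_{ik}\,\dot q^l (A^i C^k - A^k C^i) + \frac{\partial \mu_{il}}{\partial q^k}\,\dot q^l (A^i C^k - A^k C^i) + \mu_{ij}(A^i D^j - C^i B^j) \\
&\quad = \frac{\partial \mu_{il}}{\partial q^k}\,\dot q^l (A^i C^k - A^k C^i) + \mu_{ij}(A^i D^j - C^i B^j)
\end{aligned}$$

because $\Gamma^j_{ik}$ is symmetric and $(A^i C^k - A^k C^i)$ is skew-symmetric in $(i, k)$. However, this
expression coincides with (2.10) which proves (2.8). □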
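
As a quick consistency check of the formula in Proposition 2.1, one can specialize to the flat case. The sketch below assumes $Q = \mathbb{R}^n$ with $\mu_{ij} = \delta_{ij}$, an example chosen here for illustration, and uses the component names $A^i, B^i, C^i, D^i$ of the proof.

```latex
% Flat case: \mu_{ij} = \delta_{ij}, all Christoffel symbols vanish.
% For X = A^i \partial_{q^i} + B^i \partial_{\dot q^i} and
%     Y = C^i \partial_{q^i} + D^i \partial_{\dot q^i}
% one has T\tau_Q(X) = A, K(X) = B, T\tau_Q(Y) = C, K(Y) = D, hence
\[
  \mu\bigl(T\tau_Q(X), K(Y)\bigr) - \mu\bigl(T\tau_Q(Y), K(X)\bigr)
  = \sum_{i=1}^{n} \bigl(A^i D^i - C^i B^i\bigr)
  = \Bigl(\sum_{i=1}^{n} dq^i \wedge d\dot q^i\Bigr)(X, Y),
\]
% which is the canonical symplectic form of T^*R^n = TR^n with p_i = \dot q^i.
```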



Simon Hochgerner, Tudor S. Ratiu



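Thus by (2.8) we get

$$\begin{aligned}
\mathcal{C}^{\Omega} &= \{X_{u_q} \in T_{u_q}(TQ) \mid u_q \in \mathcal{D}_q,\ T_{u_q}\tau_Q(X_{u_q}) \in \mathcal{D}_q^{\perp},\ K(X_{u_q}) \in \mathcal{D}_q^{\perp}\} \\
&\cong (\mathcal{D} \times_Q \mathcal{D}^{\perp}) \oplus \bigcup_{u_q \in \mathcal{D}} \mathrm{vl}_{u_q}(\mathcal{D}_q^{\perp})
\end{aligned} \tag{2.11}$$

since $K, T_{u_q}\tau_Q : \mathcal{C}_{u_q} \to \mathcal{D}_q$ are surjective, where $\mathcal{D}^{\perp}$ is the $\mu$-orthogonal of
$\mathcal{D}$ and the vector bundle isomorphism in the last line of (2.11) is given by
$X_{u_q} \mapsto \bigl(u_q, T_{u_q}\tau_Q(X_{u_q}), P_{\mathrm{ver}}(X_{u_q})\bigr)$. This expression of $\mathcal{C}^{\Omega}$ and (2.1) show that $\mathcal{C} \cap \mathcal{C}^{\Omega} = \{0\}$
which proves (2.2).
In particular, if

$$P : (T(TQ))|_{\mathcal{D}} = \mathcal{C} \oplus \mathcal{C}^{\Omega} \to \mathcal{C} \tag{2.12}$$

is the projection along $\mathcal{C}^{\Omega}$ and $\Pi : TQ = \mathcal{D} \oplus \mathcal{D}^{\perp} \to \mathcal{D}$ is the orthogonal projection then
it follows that

$$T(\tau_Q \circ \iota) \circ P = \Pi \circ T(\tau_Q \circ \iota). \tag{2.13}$$

Indeed, using the above description of $\mathcal{C}$ and $\mathcal{C}^{\Omega}$, this follows immediately by decomposing
$(T(TQ))|_{\mathcal{D}}$ into its horizontal and vertical parts.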

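Let $\mathcal{H}$ be the kinetic energy Hamiltonian on $TQ$ which we regard as the Legendre
transform of $\mathcal{L}$. Then the dynamics of the non-holonomic system $(Q, \mathcal{D}, \mathcal{L})$ are given by
the vector field

$$X^{\mathcal{C}}_{\mathcal{H}} := P X_{\mathcal{H}} \in \mathfrak{X}(\mathcal{D})$$

where $X_{\mathcal{H}}$ is the Hamiltonian vector field of $\mathcal{H}$. More generally, for a function
$f \in C^{\infty}(TQ)$ we regard $X^{\mathcal{C}}_f := P X_f \in \mathfrak{X}(\mathcal{D})$ as the non-holonomic vector field of $f$. Let
$\Omega^{\mathcal{C}}$ denote the fiberwise restriction of $\iota^*\Omega$ to $\mathcal{C} \times \mathcal{C}$. Then (2.2) implies that $\Omega^{\mathcal{C}}$ is
non-degenerate and we may rewrite the defining equation for $X^{\mathcal{C}}_f$ as

$$i(X^{\mathcal{C}}_f)\,\Omega^{\mathcal{C}} = (df)^{\mathcal{C}}$$

where $(df)^{\mathcal{C}}$ is the fiberwise restriction of $\iota^*(df)$ to $\mathcal{C}$.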

2.B G-Chaplygin systems

Now we shall consider the case when the non-holonomic system is invariant under a
group action such that the constraints are given by a principal bundle connection. A G-Chaplygin
system consists of a Riemannian configuration space $(Q, \mu)$, a Lie group $G$
with Lie algebra $\mathfrak{g}$ which acts freely and properly on $(Q, \mu)$ by isometries, and a principal
bundle connection $\mathcal{A} \in \Omega^1(Q; \mathfrak{g})$ on $\pi : Q \to Q/G =: M$. For $\xi \in \mathfrak{g}$ denote by
$\xi_Q \in \mathfrak{X}(Q)$ the infinitesimal generator defined by

$$\xi_Q(q) := \frac{d}{dt}\Big|_{t=0} \exp(t\xi) \cdot q$$

for all $q \in Q$, where $\exp : \mathfrak{g} \to G$ is the exponential map.

The Lagrangian of this system is the kinetic energy $\mathcal{L} := \frac{1}{2}\|\cdot\|^2_{\mu}$. It is also assumed
that the constraint distribution is the horizontal subbundle of the connection $\mathcal{A}$, i.e.,
$\mathcal{D} := \ker\mathcal{A} \subset TQ$. Thus the system $(Q, \mathcal{D}, \mathcal{L})$ is a non-holonomic system and the dynamics
are determined by the Lagrange-d'Alembert equations; see [3, 4, 6, 9, 10, 25, 30]. It is not
assumed that $\mathcal{D}$ is orthogonal to the vertical space $\ker T\pi$.

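Since $\mathcal{D}$ is the horizontal subbundle, it is invariant with respect to the tangent lifted
$G$-action on $TQ$. Thus we obtain a principal $G$-fiber bundle $\mathcal{D} \to \mathcal{D}/G = TM$. This
bundle carries an induced connection $\iota^*\tau^*\mathcal{A}$, where $\iota : \mathcal{D} \hookrightarrow TQ$ is the inclusion and
$\tau : TQ \to Q$ is the tangent bundle projection. Its associated horizontal bundle is

$$\ker(\tau \circ \iota)^*\mathcal{A} = \{X_{u_q} \in T\mathcal{D} \mid T(\tau \circ \iota) X_{u_q} \in \ker\mathcal{A} = \mathcal{D}\} = \mathcal{C}. \tag{2.14}$$

Let $\mu_0$ denote the induced Riemannian metric on $M := Q/G$. Then the isomorphism

$$T_q\pi : \bigl(\mathcal{D}_q, \mu(q)|_{\mathcal{D}_q}\bigr) \to \bigl(T_{\pi(q)}M, \mu_0(\pi(q))\bigr)$$

is an isometry for the indicated inner products for all $q \in Q$.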

2.C The non-holonomic correction 

In order to carry out non-holonomic reduction we need to introduce a two-form on $TM$
induced by the momentum map and the curvature $\mathrm{Curv}^{\mathcal{A}} \in \Omega^2(Q; \mathfrak{g})$ of the connection $\mathcal{A}$.
As we shall see in the next subsection, this form is the correction that one needs to subtract
from the canonical symplectic form in order to give an almost Hamiltonian formulation
of the reduced non-holonomic system. To define this form, we need three ingredients:

(i) The adjoint bundle: Let $G$ act on $Q \times \mathfrak{g}$ by the (free and proper) action given by
$g \cdot (q, \xi) := (g \cdot q, \mathrm{Ad}_g \xi)$ for all $g \in G$, $q \in Q$, $\xi \in \mathfrak{g}$, and let
$\tilde{\mathfrak{g}} := Q \times_G \mathfrak{g} = (Q \times \mathfrak{g})/G$ be the orbit space. Elements of $\tilde{\mathfrak{g}}$ are denoted by $[q, \xi]_G$. The projection
$\rho : \tilde{\mathfrak{g}} \ni [q, \xi]_G \mapsto \pi(q) \in M$ defines the adjoint vector bundle whose fibers are Lie
algebras.

(ii) The curvature on the base: $\mathrm{Curv}^{\mathcal{A}} \in \Omega^2(Q; \mathfrak{g})$ naturally induces a two-form
$\mathrm{Curv}^{\mathcal{A}}_0 \in \Omega^2(M; \tilde{\mathfrak{g}})$ on the base $M$ with values in the adjoint bundle $\tilde{\mathfrak{g}}$ by

$$\mathrm{Curv}^{\mathcal{A}}_0(\pi(q))\bigl(T_q\pi(u_q), T_q\pi(v_q)\bigr) := \bigl[q, \mathrm{Curv}^{\mathcal{A}}(q)(u_q, v_q)\bigr]_G$$

for all $q \in Q$, $u_q, v_q \in T_qQ$.

(iii) The momentum map of the tangent lifted $G$-action: $J_G : TQ \to \mathfrak{g}^*$ is defined by
$\langle J_G(u_q), \xi \rangle = \mu(q)(u_q, \xi_Q(q))$ for all $\xi \in \mathfrak{g}$, $u_q \in TQ$, and is equivariant.

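To get a grip on the non-holonomic correction two-form, we begin by describing it when $G$
is a commutative group. Then the adjoint bundle is trivial: $\rho : \tilde{\mathfrak{g}} = M \times \mathfrak{g} \to M$ is the
projection on the first factor. Thus, $\mathrm{Curv}^{\mathcal{A}}_0 \in \Omega^2(M; \mathfrak{g})$ and we define the non-holonomic
correction two-form $\Xi \in \Omega^2(TM)$ by $\Xi := \langle J_G \circ \mathrm{hl}^{\mathcal{A}}, \mathrm{Curv}^{\mathcal{A}}_0 \rangle$, that is,

$$\Xi(u_x)(X_{u_x}, Y_{u_x}) := \bigl\langle J_G\bigl(\mathrm{hl}^{\mathcal{A}}_q(u_x)\bigr),\ \mathrm{Curv}^{\mathcal{A}}_0(x)\bigl(T_{u_x}\tau_M(X_{u_x}), T_{u_x}\tau_M(Y_{u_x})\bigr) \bigr\rangle \tag{2.15}$$

for all $x \in M$, $u_x \in T_xM$, $X_{u_x}, Y_{u_x} \in T_{u_x}(TM)$, where $\mathrm{hl}^{\mathcal{A}}_q := (T_q\pi|_{\mathcal{D}_q})^{-1} : T_xM \to \mathcal{D}_q \subset TQ$,
$x = \pi(q)$, is the horizontal lift operator associated to the connection $\mathcal{A}$.
The pairing on the right hand side of this formula is between $\mathfrak{g}^*$ and $\mathfrak{g}$. The right hand
side of this formula seems to depend on $q \in Q$. However, this is not the case because
the horizontal lifts at two distinct points in $Q$ are related by a group element and the
momentum map is invariant under the $G$-action (since $G$ is commutative).

As stated, this formula does not make sense for general Lie groups because the momentum
map is $\mathfrak{g}^*$-valued and the curvature on the base is $\tilde{\mathfrak{g}}$-valued so the pairing makes
no sense. However, the idea for the general formula is based on (2.15). We define
$\Xi \in \Omega^2(TM)$ by

$$\Xi_{(x,u)}\bigl(X_{(x,u)}, Y_{(x,u)}\bigr) := \bigl\langle J_G\bigl(\mathrm{hl}^{\mathcal{A}}_q(x,u)\bigr),\ \mathrm{Curv}^{\mathcal{A}}(q)\bigl(\mathrm{hl}^{\mathcal{A}}_q(T_{(x,u)}\tau_M(X_{(x,u)})),\ \mathrm{hl}^{\mathcal{A}}_q(T_{(x,u)}\tau_M(Y_{(x,u)}))\bigr) \bigr\rangle \tag{2.16}$$

for $X_{(x,u)}, Y_{(x,u)} \in T_{(x,u)}(TM)$ and $q \in \pi^{-1}(x)$; since both entries in this pairing are
$G$-equivariant the ambiguity cancels out, that is, the right hand side in (2.16) does not
depend on $q$ but only on $\pi(q) = x$.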

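Due to the importance of this formula we make a few additional comments. Recall
that the momentum map $J_G : TQ \to \mathfrak{g}^*$ is equivariant with respect to the coadjoint action
on $\mathfrak{g}^*$. The tangent lifted $G$-action restricts to an action on $\mathcal{D} \subset TQ$; indeed $\mathcal{D} = \ker\mathcal{A}$
is the horizontal subbundle and is hence $G$-invariant. Corresponding to the $G$-principal
bundle projection $\mathcal{D} \to \mathcal{D}/G = TM$ there is a natural connection which is induced from
the connection $\mathcal{A}$ on $Q \to Q/G$, namely $\iota^*\tau^*\mathcal{A}$ where $\iota : \mathcal{D} \hookrightarrow TQ$ is the inclusion
and $\tau : TQ \to Q$ is the tangent bundle projection. The curvature of $\iota^*\tau^*\mathcal{A}$ is $\iota^*\tau^*\mathrm{Curv}^{\mathcal{A}}$
which is equivariant: $l_g^*\,\iota^*\tau^*\mathrm{Curv}^{\mathcal{A}} = \mathrm{Ad}_g \circ (\iota^*\tau^*\mathrm{Curv}^{\mathcal{A}})$, where $l_g : \mathcal{D} \to \mathcal{D}$ is the
action of $g \in G$. Thus the two-form $\langle J_G, \iota^*\tau^*\mathrm{Curv}^{\mathcal{A}} \rangle$ defines a $G$-invariant two-form
on $\mathcal{D}$. This two-form is, moreover, horizontal: since $\iota^*\tau^*\mathrm{Curv}^{\mathcal{A}}$ is a curvature form on
$\mathcal{D} \to \mathcal{D}/G$ it vanishes upon insertion of vertical vectors, whence the same holds also for
$\langle J_G, \iota^*\tau^*\mathrm{Curv}^{\mathcal{A}} \rangle$. Thus the two-form $\langle J_G, \iota^*\tau^*\mathrm{Curv}^{\mathcal{A}} \rangle$ is basic and hence drops to a
well-defined two-form $\Xi$ on $\mathcal{D}/G = TM$. Implementing the computations suggested above
gives (2.16).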
2.D Non-holonomic reduction

Identify $TQ$ with $T^*Q$ by the metric $\mu$ and $TM$ with $T^*M$ by the metric $\mu_0$. Consider the
orbit projection map

$$T\pi|_{\mathcal{D}} : \mathcal{D} \to \mathcal{D}/G = TM.$$

We may also associate a fiberwise inverse to this mapping which is given by the horizontal
lift mapping $\mathrm{hl}^{\mathcal{A}} : Q \times_M TM \to \mathcal{D}$ associated to $\mathcal{A}$. The following statements are proved
in [4, 15, 25].

Proposition 2.2 (Non-holonomic reduction). The following hold.

(1) $\Omega^{\mathcal{C}}$ descends to a non-degenerate two-form $\Omega_{\mathrm{nh}}$ on $TM$.

(2) $\Omega_{\mathrm{nh}} = \Omega_M - \Xi \in \Omega^2(TM)$, where $\Omega_M = -d\theta_M$ is the canonical symplectic form on
$TM$ and $\Xi$ is the non-holonomic correction two-form given by (2.16).

(3) Let $h : TQ \to \mathbb{R}$ be $G$-invariant. Then the vector field $X^{\mathcal{C}}_h$ is $T\pi|_{\mathcal{D}}$-related to the
vector field $X^{\mathrm{nh}}_{h_0}$ on $TM$ defined by

$$i(X^{\mathrm{nh}}_{h_0})\,\Omega_{\mathrm{nh}} = dh_0$$

where $h_0 : TM \to \mathbb{R}$ is the induced Hamiltonian.

In general, $\Omega_{\mathrm{nh}}$ is an almost symplectic form, that is, it is non-degenerate and non-closed.
We will denote the reduced Hamiltonian by $\mathcal{H}_c$ and refer to the almost Hamiltonian
system $(TM, \Omega_{\mathrm{nh}}, \mathcal{H}_c)$ as the reduced data. The identity $\Omega_{\mathrm{nh}} = \Omega_M - \Xi$ appears for the
first time, albeit not completely explicitly, in [4]. A proof using moving frames is given
in [15] where it is also called the "(J, K)-formula". A different proof following the above
outline is contained in [25, Prop 2.2].

2.E The preserved measure 

Does $(TM, \Omega_{\mathrm{nh}}, \mathcal{H}_c)$ possess a preserved measure? This is an important question since
it says something about the possible existence of asymptotic equilibria and also plays a
prominent role in the theory of integration of non-holonomic systems. Correspondingly,
this topic is touched upon in all of [3, 6, 10, 15, 25, 22, 30]. In [8], a necessary and
sufficient condition for the existence of a preserved measure in terms of local coordinates
on the base manifold $M$ is given. We derive below an equivalent formulation of
this result which is more closely adapted to the Riemannian structure on $M$. This point of
view will then be exploited in Section 4 in the stochastic context.

For brevity we will denote $X = X^{\mathrm{nh}}_{\mathcal{H}_c}$ in this subsection. Let $\Omega_M^m$, $m = \dim M$, be the
Liouville volume on $TM$. Then there is a preserved measure for the flow of $X$ if and only
if there is a strictly positive function $\mathcal{N} : M \to \mathbb{R}$ such that $(\mathcal{N} \circ \tau_M)\,\Omega_M^m$ is preserved,
that is,

$$L_X\bigl((\mathcal{N} \circ \tau_M)\,\Omega_M^m\bigr) = 0; \tag{2.17}$$

see [8] for a proof. In such a case $\mathcal{N}$ is called the density of the preserved measure with
respect to the Liouville volume. As shown in [8, Remark 7.4], it suffices to consider
density functions on $M$.

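To reformulate condition (2.17) we want to use the fact that $(M, \mu_0)$ is a Riemannian
manifold. Hence we equip $TM$ with the Sasaki metric $\sigma$ associated to $\mu_0$ (see, e.g., [20]),

$$\sigma(X_{u_x}, Y_{u_x}) := \mu_0(x)\bigl(T_{u_x}\tau_M(X_{u_x}), T_{u_x}\tau_M(Y_{u_x})\bigr) + \mu_0(x)\bigl(K_M(X_{u_x}), K_M(Y_{u_x})\bigr) \tag{2.18}$$

for all $X_{u_x}, Y_{u_x} \in T_{u_x}(TM)$, where $\tau_M : TM \to M$ is the tangent bundle projection and
$K_M : \mathrm{Ver}(\tau_M) \to TM \times_M TM$ is the inverse of the vertical lift map $\mathrm{vl}_M : TM \times_M TM \to \mathrm{Ver}(\tau_M)$;
note, in particular, that $K_M(X_{u_x}) \in T_xM$. We recall some of the key
properties of the Sasaki metric; see [20] for proofs.

(i) The Sasaki metric $\sigma$ is the unique Riemannian metric on $TM$ such that
$\tau_M : (TM, \sigma) \to (M, \mu_0)$ is a Riemannian submersion, that is, the isomorphism

$$T_{u_x}\tau_M : \bigl((\ker T_{u_x}\tau_M)^{\perp_\sigma} = \mathrm{Hor}_{u_x},\ \sigma(u_x)\bigr) \to \bigl(T_xM,\ \mu_0(x)\bigr)$$

is an isometry (for the indicated inner products) for all $u_x \in TM$, where $\perp_\sigma$
denotes the perpendicular relative to the Sasaki inner product $\sigma(u_x)$ on $T_{u_x}(TM)$.

(ii) $\mathrm{Hor}$ and $\mathrm{Ver}$ are $\sigma$-perpendicular complements of each other: $\mathrm{Hor} = \mathrm{Ver}^{\perp_\sigma}$.

(iii) The vertical lift map $\mathrm{vl} : TM \times_M TM \to \mathrm{Ver} \subset TTM$ is an isometry of vector
bundles over $TM$, thinking of the projection onto the first factor $\mathrm{pr}_1 : TM \times_M TM \to TM$
as a vector bundle over $TM$ and $\mu_0$ as a vector bundle metric.

Given a vector field $X$ on $M$ we shall denote its horizontal lift relative to the Riemannian
metric $\mu_0$ by $X^h \in \mathfrak{X}(TM, \mathrm{Hor})$ and its vertical lift by $X^v \in \mathfrak{X}(TM, \mathrm{Ver})$.

Lemma 2.3 (The non-holonomic vector field). If $X_0 = X_{\mathcal{H}_c} = \Omega_M^{-1}(d\mathcal{H}_c) \in \mathfrak{X}(TM)$ is
the standard Hamiltonian vector field, $X = \Omega_{\mathrm{nh}}^{-1}(d\mathcal{H}_c) \in \mathfrak{X}(TM)$ is the non-holonomic
vector field, and $\{u_1, \dots, u_m\}$ is a local orthonormal frame on $M$, then the following
hold:

$$T_{u_x}\tau_M(X(u_x)) = T_{u_x}\tau_M(X_0(u_x)) = u_x \quad \text{for all } u_x \in TM, \tag{2.19}$$

$$X(u_x) = X_0(u_x) - \sum_{i=1}^{m} \bigl\langle (J_G \circ \mathrm{hl}^{\mathcal{A}}_q)(u_x),\ (\mathrm{Curv}^{\mathcal{A}}_q \circ \Lambda^2\mathrm{hl}^{\mathcal{A}}_q)(u_x, u_i(x)) \bigr\rangle\, u_i^v(u_x) \tag{2.20}$$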
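where the second equation holds locally in the domain of definition of the given frame,
and is well-defined independently of the choice of $q \in \pi^{-1}(x)$.

Proof. We begin by noting that $\Omega_M(X_0, Y) = d\mathcal{H}_c(Y) = \Omega_{\mathrm{nh}}(X, Y)$ for all
$Y \in \mathfrak{X}(TM)$. Hence by Propositions 2.2 and 2.1

$$\begin{aligned}
\Xi(u_x)\bigl(X(u_x), Y(u_x)\bigr) &= \Omega_M(u_x)\bigl((X - X_0)(u_x), Y(u_x)\bigr) \\
&= \mu_0(x)\bigl(T_{u_x}\tau_M((X - X_0)(u_x)), K_M(Y(u_x))\bigr) \\
&\quad - \mu_0(x)\bigl(T_{u_x}\tau_M(Y(u_x)), K_M((X - X_0)(u_x))\bigr).
\end{aligned}$$

This implies, firstly, that $T_{u_x}\tau_M(X(u_x)) = T_{u_x}\tau_M(X_0(u_x)) = u_x$ since $\mathcal{H}_c$ is the kinetic
energy Hamiltonian of the induced metric $\mu_0$. Secondly, since $K_M(u_i^v(u_x)) = u_i(x) = T_{u_x}\tau_M(u_i^h(u_x))$,
we find locally

$$\begin{aligned}
(X - X_0)(u_x) &= \sum_{i=1}^{m} \mu_0(x)\bigl(u_i(x), K_M((X - X_0)(u_x))\bigr)\, u_i^v(u_x) \\
&= -\sum_{i=1}^{m} \Xi(u_x)\bigl(X(u_x), u_i^h(u_x)\bigr)\, u_i^v(u_x) \\
&= -\sum_{i=1}^{m} \bigl\langle (J_G \circ \mathrm{hl}^{\mathcal{A}}_q)(u_x),\ (\mathrm{Curv}^{\mathcal{A}}_q \circ \Lambda^2\mathrm{hl}^{\mathcal{A}}_q)(u_x, u_i(x)) \bigr\rangle\, u_i^v(u_x)
\end{aligned}$$

where we have used $T_{u_x}\tau_M(X(u_x)) = u_x$ in the last line. □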

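Let $\mathrm{vol}_\sigma$ be the volume form on $TM$ induced by the Riemannian metric $\sigma$. We shall
prove the following formula:

$$\mathrm{vol}_\sigma = \frac{1}{m!}\,\Omega_M^m. \tag{2.21}$$

Indeed by (2.9), denoting by $\mathfrak{S}_m$ the permutation group of $\{1, \dots, m\}$, we have

$$\begin{aligned}
\Omega_M^m &= (\mu_{ij}\, dq^i \wedge d\dot q^j)^m \\
&= m! \sum_{s \in \mathfrak{S}_m} \mu_{1 s(1)} \cdots \mu_{m s(m)}\; dq^1 \wedge d\dot q^{s(1)} \wedge \dots \wedge dq^m \wedge d\dot q^{s(m)} \\
&= m! \sum_{s \in \mathfrak{S}_m} \operatorname{sign}(s)\, \mu_{1 s(1)} \cdots \mu_{m s(m)}\; dq^1 \wedge d\dot q^1 \wedge \dots \wedge dq^m \wedge d\dot q^m \\
&= m!\, \det(\mu_{ij})\; dq^1 \wedge d\dot q^1 \wedge \dots \wedge dq^m \wedge d\dot q^m.
\end{aligned}$$

On the other hand, in the coordinates $(y^1, y^2, \dots, y^{2m-1}, y^{2m})$ on $TM$, where $y^{2i-1} = q^i$
and $y^{2i} = \dot q^i$ for $i = 1, \dots, m$, we have by the usual formula of the Riemannian volume,
$\mathrm{vol}_\sigma = \sqrt{\det(\sigma_{IJ})}\; dq^1 \wedge d\dot q^1 \wedge \dots \wedge dq^m \wedge d\dot q^m$, where
$\sigma_{IJ} := \sigma\bigl(\frac{\partial}{\partial y^I}, \frac{\partial}{\partial y^J}\bigr)$. Since, by
the definition (2.18) of the Sasaki metric we have
$\mu_{ij} = \sigma\bigl(\frac{\partial}{\partial q^i}, \frac{\partial}{\partial q^j}\bigr) = \sigma\bigl(\frac{\partial}{\partial \dot q^i}, \frac{\partial}{\partial \dot q^j}\bigr)$, and
$\sigma\bigl(\frac{\partial}{\partial q^i}, \frac{\partial}{\partial \dot q^j}\bigr) = 0$, it follows that the matrix $(\sigma_{IJ})$ is of the form

$$(\sigma_{IJ}) = P \begin{pmatrix} (\mu_{ij}) & 0 \\ 0 & (\mu_{ij}) \end{pmatrix} P^{\top}$$

for a permutation matrix $P$. Therefore, $\sqrt{\det(\sigma_{IJ})} = \det(\mu_{ij})$ which proves (2.21).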
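
The combinatorial step in the computation of the top power of the symplectic form can be checked by hand in low dimension. The following sketch treats $m = 2$; it is an illustration added for the reader and writes $\Omega$ for the canonical form in the chart $(q^1, q^2, \dot q^1, \dot q^2)$.

```latex
% Sanity check for m = 2 (illustration only).
% Only the terms \mu_{ij}\, dq^i \wedge d\dot q^j contribute to the top power,
% since any dq^i \wedge dq^k factor would force more than two dq's:
\[
  \Omega^2
  = 2\,\mu_{11}\mu_{22}\; dq^1\wedge d\dot q^1\wedge dq^2\wedge d\dot q^2
  + 2\,\mu_{12}\mu_{21}\; dq^1\wedge d\dot q^2\wedge dq^2\wedge d\dot q^1 .
\]
% Swapping d\dot q^2 and d\dot q^1 in the second term costs one sign, so
\[
  \Omega^2
  = 2!\,(\mu_{11}\mu_{22} - \mu_{12}\mu_{21})\;
    dq^1\wedge d\dot q^1\wedge dq^2\wedge d\dot q^2
  = 2!\,\det(\mu_{ij})\;
    dq^1\wedge d\dot q^1\wedge dq^2\wedge d\dot q^2 .
\]
```

Dividing by $2!$ gives $\det(\mu_{ij})$ times the coordinate volume form, in agreement with the general formula.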
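By (2.21), condition (2.17) holds if and only if

$$L_X\bigl((\mathcal{N} \circ \tau_M)\,\mathrm{vol}_\sigma\bigr) = (\mathcal{N} \circ \tau_M)\Bigl(\bigl\langle d(\log\mathcal{N} \circ \tau_M), X \bigr\rangle + \mathrm{div}_{\mathrm{vol}_\sigma} X\Bigr)\mathrm{vol}_\sigma = 0. \tag{2.22}$$

Let us define

$$L(TM) := \{\,l \in C^{\infty}(TM) : l_x := l|_{T_xM} : T_xM \to \mathbb{R} \text{ is linear for all } x \in M\,\}$$

and consider the prescription $\Phi : L(TM) \to \Omega^1(M)$, $\Phi(l)_x(u_x) = l_x(u_x)$, $u_x \in T_xM$,

which is an isomorphism of $C^{\infty}(M)$-modules.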

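Lemma 2.4. The following statements hold.

(1) $\mathrm{div}_{\mathrm{vol}_\sigma} X \in L(TM)$.

(2) Let $\{u_i \mid i = 1, \dots, m\}$ denote a local orthonormal frame on $M$. Then

$$\begin{aligned}
\mathrm{div}_{\mathrm{vol}_\sigma} X(u_x) &= -\sum_{i=1}^{m} \Xi(u_i(x))\bigl(X(u_x), u_i^h(u_x)\bigr) \\
&= -\sum_{i=1}^{m} \bigl\langle (J_G \circ \mathrm{hl}^{\mathcal{A}}_q)(u_i(x)),\ (\mathrm{Curv}^{\mathcal{A}}_q \circ \Lambda^2\mathrm{hl}^{\mathcal{A}}_q)(u_x, u_i(x)) \bigr\rangle.
\end{aligned} \tag{2.23}$$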

Proof. Clearly (2) implies (1) so we shall prove (2) below. 

We use the Levi-Civita connection $\nabla^\mu$ to split $TTM = \mathrm{Hor} \oplus \mathrm{Ver}$ where $\mathrm{Ver} =
\ker T(\tau : TM \to M)$. Given a vector field $X$ on $M$ we shall, as before, denote its
horizontal lift by $X^h \in \mathfrak{X}(TM, \mathrm{Hor})$ and its vertical lift by $X^v \in \mathfrak{X}(TM, \mathrm{Ver})$.

Let $\{u_i \in \mathfrak{X}(M) \mid i = 1, \ldots, \dim M\}$ be a local orthonormal frame for $TM$. Then
$\{(u_i^h, u_i^v) \mid i = 1, \ldots, \dim M\}$ is a local orthonormal frame for $TTM$ with respect to $\sigma$.
By Lemma 2.3, if Xq = X^^ = i7^/(d'Hc) £ X(TM) is the standard Hamiltonian vector 
field, then we can locally express X as 

X = Xo - ^((Jg o hl-^,){u,), (Curv^ o a\1-^,){u,,u,{x))) <(m,). 

Notice that Xq preserves O'" whence diVvoi^Xo = 0. According to, e.g., [20, Proposi- 
tion 7.2], it is true that VhU^ = (V(!°Mi)'' and V^u"^ = where V is the Levi-Civita 
connection of $\sigma$. Therefore, sometimes suppressing the base point $u_x$ for readability,

m 



i=l 



m 

= J] ( - a(Xo, V;.«f ) + «,V(Xo, ) + <a(Xo, <) 
- <((Jg o hl-^,)(«,), (Curv^ o A2hl^,)(n,, 

m 

= div™i„Xo - J2 {i3Goh\-^)g{u,{x)), {Curvf o A2hl-^,)(M,M,(x))) 

m 

= - ^ ((Jg o hl-^),(^z,(x)), (Curv^ o A'hl-^,){u, Ui{x))) . 
1=1 

To see that the formula does not depend on the particular choice of $q \in \pi^{-1}(x)$ one
uses equivariance of the involved expressions together with the observation that any
$G$-ambiguity cancels out in the pairing. □

Therefore, $\mathrm{div}_{\mathrm{vol}_\sigma} X$ can be turned into a one-form on $M$ through the canonical
isomorphism $\Phi : L(TM) \to \Omega^1(M)$. Let us define

\[ \beta := -\Phi(\mathrm{div}_{\mathrm{vol}_\sigma} X) \in \Omega^1(M) \tag{2.24} \]
for the non-holonomic vector field X = . 

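Proposition 2.5. The system $(TM, \Omega_{nh}, \mathcal{H}_c)$ admits a preserved measure if and only if
$\beta \in \Omega^1(M)$ is exact. If $\beta = dF$ for some function $F$ on $M$ then $\mathcal{N} = e^F$ is the density of
the preserved measure for the Liouville volume.

Proof. By Lemma 2.3, $\langle d(\log \mathcal{N} \circ \tau_M), X \rangle(u_x) = d(\log \mathcal{N})(u_x)$ for $u_x \in TM$ and hence
$\Phi\big(\langle d(\log \mathcal{N} \circ \tau_M), X \rangle\big) = d(\log \mathcal{N})$. Now by (2.22) a preserved measure $(\mathcal{N} \circ \tau_M)\,\mathrm{vol}_\sigma$
exists if and only if

\[ d(\log \mathcal{N}) = \Phi\big(\langle d(\log \mathcal{N} \circ \tau_M), X \rangle\big) = -\Phi(\mathrm{div}_{\mathrm{vol}_\sigma} X) = \beta, \]

i.e., $\beta \in \Omega^1(M)$ is exact. □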

As stated above, this result is proved in [8, Theorem 7.5] but our interpretation of the 
form (3 is slightly different. The formula 

m 

f3{x)M = 5^H(Mi(x))(xK),Mf(M,)) (2.25) 

1=1 

m 

= ((Jg o hl^,)(M,(x)), (Curv^ o a\\^^){uxM^) 



i=l 
$u_x \in T_xM$, in a local orthonormal frame $\{u_1, \ldots, u_m\}$, will be useful in Section 4 below.

3 Stochastic dynamics on manifolds 

3.A Diffusions on manifolds 

This subsection is a review of some necessary definitions and results which are all con- 
tained in the books [26, 18]. 

A diffusion is a continuous stochastic process which has the strong Markov property. 
This is a concept which can be formulated in any topological space. 

Diffusion processes 

Let $X$ be a locally compact topological space with one-point compactification $\dot X = X \cup
\{\infty\}$ and endow $X$ with its Borel $\sigma$-algebra $\mathcal{B}(X)$. Define $W(X)$ to be the set of all maps
$w : [0, \infty) \to \dot X$ such that there is a $\zeta(w) \in [0, \infty]$ satisfying

(1) $w(t) \in X$ for all $t \in [0, \zeta(w))$ and $w : [0, \zeta(w)) \to X$ is continuous;

(2) $w(t) = \infty$ for all $t \ge \zeta(w)$.

Let $l \in \mathbb{N}$, $0 \le t_1 < \ldots < t_l \in \mathbb{R}_+$, $A \subset \prod^l \dot X$ a Borel set, and consider the
evaluation mapping $\mathrm{ev}(t_1, \ldots, t_l) : W(X) \to \prod^l \dot X$, $w \mapsto (w(t_1), \ldots, w(t_l))$. Then

\[ S = \mathrm{ev}(t_1, \ldots, t_l)^{-1}(A) \]

is called a Borel cylinder set in $W(X)$. If $t > 0$ and $t_l \le t$ then $S$ is a Borel cylinder set
up to time $t$.

The set $W(X)$ is equipped with the $\sigma$-algebra $\mathcal{B}(W(X))$ generated by all Borel cylinder
sets in $W(X)$. This $\sigma$-algebra has a natural filtration given by the family of

\[ (\mathcal{B}_t(W(X)))_{t \ge 0} \]

which are the $\sigma$-algebras generated by Borel cylinder sets up to time $t$.

A family of probabilities $(P_x)_{x \in X}$ on $(W(X), \mathcal{B}(W(X)))$ is said to be a system of
diffusion measures on $(W(X), \mathcal{B}(W(X)), \mathcal{B}_t(W(X)))$ if it has the strong Markov property,
the definition of which we will give shortly.

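A $(\mathcal{B}_t(W(X)))_t$-stopping time is a random variable $\tau : W(X) \to \bar{\mathbb{R}}_+ = \mathbb{R}_+ \cup \{\infty\}$
such that $\{w \in W(X) \mid \tau(w) \le t\} \in \mathcal{B}_t(W(X))$ for all $t \in \mathbb{R}_+$.

For $s \in \mathbb{R}_+$ we define the time shift operator

\[ \Sigma_s : W(X) \to W(X), \quad w \mapsto (\Sigma_s w : t \mapsto w(s + t)). \tag{3.1} \]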
A family of probabilities $(P_x)_{x \in X}$ on $(W(X), \mathcal{B}(W(X)))$ satisfies the strong Markov
property if, for all $x \in X$, $(\mathcal{B}_t(W(X)))_t$-stopping times $\tau$, bounded $\mathcal{B}_\tau(W(X)) \times \mathcal{B}(W(X))$-
measurable functions $F : W(X) \times W(X) \to \mathbb{R}$, and $s \in \mathbb{R}_+$, we have

\[ \int_{\{\tau(w) < \infty\}} F(w, \Sigma_{\tau(w)} w)\, P_x(dw) = \int_{\{\tau(u) < \infty\}} \Big( \int_{W(X)} F(u, w)\, P_{u(\tau(u))}(dw) \Big) P_x(du). \tag{3.2} \]

See [41, p. 249]. 

Let $(\Omega, \mathcal{F}, P)$ be a probability space and $\Gamma : \Omega \times \mathbb{R}_+ \to \dot X$ a map. Then $\Gamma$ is called
a stochastic process if $\Gamma_t : \Omega \to \dot X$, $\omega \mapsto \Gamma_t(\omega)$, is a random variable for all $t \in \mathbb{R}_+$.
Define $\hat\Gamma : \omega \mapsto (t \mapsto \Gamma_t(\omega))$. Then $\Gamma$ is said to be a continuous stochastic process in $X$ if
$\hat\Gamma : (\Omega, \mathcal{F}) \to (W(X), \mathcal{B}(W(X)))$ is a random variable. (Below, when $X$ is a manifold,
we will only be dealing with continuous processes.) Note that, for all $\omega \in \Omega$, the paths
$[0, \zeta(\hat\Gamma(\omega))) \ni t \mapsto \Gamma_t(\omega) \in X$ are continuous; the map $\zeta : W(X) \to [0, \infty]$ was part of
the definition of $W(X)$.

The law of $\Gamma$ is, by definition, the push-forward probability $\hat\Gamma_* P$ on $(W(X), \mathcal{B}(W(X)))$,
i.e., $\hat\Gamma_* P(S) = P(\hat\Gamma^{-1}(S))$ for all $S \in \mathcal{B}(W(X))$.

The process $\Gamma : \Omega \times \mathbb{R}_+ \to \dot X$ defined on the probability space $(\Omega, \mathcal{F}, P)$ is a diffusion
in $X$ if it is a continuous process and there is a system of diffusion measures $(P_x)_{x \in X}$
such that $\hat\Gamma_* P = P_\mu$ as probability laws on $(W(X), \mathcal{B}(W(X)))$; here

\[ P_\mu(S) := \int_X P_x(S)\, \mu(dx) \quad \text{for all } S \in \mathcal{B}(W(X)) \]

and $\mu = (\Gamma_0)_* P : \mathcal{B}(X) \to [0, 1]$ is the initial distribution of $\Gamma$.

A diffusion $\Gamma$ in $X$ with associated system of diffusion measures $(P_x)_x$ is said to be
generated by a linear operator $A$ on the Banach space of continuous functions $C(\dot X)$ with
domain of definition $\mathcal{D}(A) \subset C(\dot X)$ if, for all $x \in X$, $t \ge 0$, and $f \in \mathcal{D}(A)$, the stochastic
process $M^f_t : W(X) \to \mathbb{R}$,

\[ M^f_t(w) := f(w(t)) - f(w(0)) - \int_0^t (Af)(w(s))\, ds, \]

is a $P_x$-martingale on $(W(X), \mathcal{B}(W(X)))$ for the filtration $(\mathcal{B}_t(W(X)))_{t \ge 0}$. In this case,
$A$ is called the generator of $\Gamma$. See [26, Defs. IV.5.3 and IV.6.2]. The definition of a
martingale is recalled below.

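Let $(\Omega, \mathcal{F}, P)$ be a probability space. A family $(\mathcal{F}_t)_{t \in \mathbb{R}_+}$ of sub-$\sigma$-algebras $\mathcal{F}_t \subset \mathcal{F}$
is called a reference family if it is increasing, i.e., $\mathcal{F}_t \subset \mathcal{F}_s$ for $0 \le t \le s$, and
right-continuous, i.e., $\bigcap_{\varepsilon > 0} \mathcal{F}_{t+\varepsilon} = \mathcal{F}_t$ for all $t \in \mathbb{R}_+$. Whenever we mention $(\mathcal{F}_t)$ we will
suppress the index set $\mathbb{R}_+$, tacitly assume that it is a reference family, and refer to it as the
filtration of $\mathcal{F}$ so that $(\Omega, \mathcal{F}, (\mathcal{F}_t), P)$ becomes a filtered probability space.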

A stochastic process $M : \Omega \times \mathbb{R}_+ \to \mathbb{R}$ is called a martingale on $(\Omega, \mathcal{F}, (\mathcal{F}_t), P)$ if
the following conditions are met:

(1) $M_t : \Omega \to \mathbb{R}$ is integrable for all $t \in \mathbb{R}_+$;

(2) $M_t : \Omega \to \mathbb{R}$ is $\mathcal{F}_t$-measurable for all $t \in \mathbb{R}_+$, i.e., $M$ is $(\mathcal{F}_t)$-adapted;

(3) $E[M_t \mid \mathcal{F}_s] = M_s$ for all $t \ge s \ge 0$, i.e., $E[(M_t - M_s)\chi_F] = 0$ for all $t \ge s \ge 0$ and
all $F \in \mathcal{F}_s$, where $\chi_F$ is the characteristic function of the set $F$.

If $W : \Omega \times \mathbb{R}_+ \to \mathbb{R}^k$ is the (a fortiori continuous) diffusion defined on the filtered
probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t), P)$ with initial condition $W_0 = 0$ a.s. and with generator
$\frac{1}{2}\Delta = \frac{1}{2}\sum_i \partial_i \partial_i$, then $W$ is called an $(\mathcal{F}_t)$-adapted Brownian motion. See [26, Example IV.5.2]
or [41, Remark 7.1.23] for this characterization of Brownian motion. Below,
we will be concerned with Brownian motion on a Riemannian manifold and then this
characterization will be taken to be the definition of Brownian motion.

Diffusions via Stratonovich equations 

Let $(\Omega, \mathcal{F}, (\mathcal{F}_t), P)$ be a filtered probability space as above and suppose now that $X = Q$
is a manifold. From now on, all stochastic processes will be assumed to be continuous.

If $N$ is a manifold then a Stratonovich operator $S$ from $TN$ to $TQ$ is a section of
$T^*N \otimes TQ \to N \times Q$. Equivalently, we can view $S$ as a smooth map $S : Q \times TN \to TQ$
which is linear in the fibers and covers the identity on $Q$. Let $X_0, X_1, \ldots, X_k$ be vector
fields on $Q$ and define the associated Stratonovich operator $S : Q \times T\mathbb{R}^{k+1} \to TQ$ by

\[ S(x, w, w') := \sum_{i=0}^{k} X_i(x)\, \langle e_i, w' \rangle, \]

where $x \in Q$, $(w, w') \in T_w\mathbb{R}^{k+1} = \{w\} \times \mathbb{R}^{k+1}$, $\{e_i \mid i = 0, 1, \ldots, k\}$ is the
orthonormal standard basis in $\mathbb{R}^{k+1}$ and $\langle \cdot, \cdot \rangle$ is the standard inner product in $\mathbb{R}^{k+1}$. We
note that the number $k$ is not related to the dimension of $Q$.

Consider the stochastic process $Y : \Omega \times \mathbb{R}_+ \to \mathbb{R}^{k+1}$, $(\omega, t) \mapsto (t, W_t(\omega))$, where $W$
denotes $(\mathcal{F}_t)$-adapted Brownian motion in $\mathbb{R}^k$.

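We will be concerned with Stratonovich equations of the form

\[ \delta\Gamma = S(Y, \Gamma)\,\delta Y; \tag{3.3} \]

a continuous $(\mathcal{F}_t)$-adapted process $\Gamma : \Omega \times \mathbb{R}_+ \to Q$ is called a solution to (3.3) if there
is an $(\mathcal{F}_t)$-adapted Brownian motion $W = (W^i)$ in $\mathbb{R}^k$ such that, in the Stratonovich sense,

\[ f(\Gamma_t) - f(\Gamma_0) = \int_0^t (X_0 f)(\Gamma_s)\, ds + \sum_{i=1}^{k} \int_0^t (X_i f)(\Gamma_s)\, \delta W^i_s \tag{3.4} \]

for all smooth functions $f \in C^\infty(Q)$ with compact support.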
A few comments are in order. The definition of (3.3) is (3.4). The second integral in
(3.4) is the Stratonovich integral (signaled by saying that the equation is to hold "in the
Stratonovich sense"). We will not go into the elaborate construction of the
Stratonovich integral here and refer to [37, 26, 18] for its definition and an in-depth study.

For the readers more familiar with Itô calculus we add the following remarks. Itô
integrals are defined by a Riemann sum approximation with the rather essential difference
that, in the sum, one evaluates the integrand at the left end-points of the partition intervals.
The Stratonovich integral, on the other hand, can be obtained by evaluating the integrand
at the mid-points of the sub-intervals. While Itô integrals give rise to a new transformation
rule (the Itô formula), transformed Stratonovich integrals obey the same change of
variables formula as Riemann integrals. This is the essential reason why, on manifolds,
Itô calculus is replaced by Stratonovich calculus. Concretely, the Stratonovich integral is
characterized in [26, Thm. III.1.4] by

\[ \int_0^t Y\, \delta X = \operatorname*{l.i.p.}_{|\Delta| \to 0} \sum_{i=1}^{n} \frac{Y(t_i) + Y(t_{i-1})}{2}\, \big(X(t_i) - X(t_{i-1})\big), \]

where $\Delta$ is a partition $0 = t_0 < t_1 < \cdots < t_n = t$ with maximal step size $|\Delta|$ and
l.i.p. is "limit in probability"; here $X$, $Y$ are quasimartingales (a general class of
semimartingales). We do not go into more details of the definition of the Stratonovich integral
here and refer the reader to the above mentioned books.
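
The difference between the two Riemann-sum conventions can be checked numerically. The following Python sketch is illustrative only: the function name `riemann_sums`, the step count, and the seed are assumptions made for this example and do not come from the references. It approximates the integral of $W$ against $W$ along one simulated Brownian path, once with left end-point sums (Itô) and once with the averaged end-point sums (Stratonovich). The averaged sum telescopes to $W_T^2/2$, as the ordinary chain rule predicts, while the left end-point sum differs from it by half the accumulated quadratic variation.

```python
import math
import random


def riemann_sums(n_steps=10_000, T=1.0, seed=0):
    """Approximate the integral of W against W on one simulated Brownian path.

    Returns (ito_sum, stratonovich_sum, W_T, quadratic_variation).
    """
    rng = random.Random(seed)
    dt = T / n_steps
    w = 0.0
    ito = strat = qv = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))   # Brownian increment over one step
        ito += w * dw                        # integrand at the left end-point
        strat += (w + 0.5 * dw) * dw         # average of the two end-points
        qv += dw * dw                        # discrete quadratic variation
        w += dw
    return ito, strat, w, qv


ito, strat, w_T, qv = riemann_sums()
print("Stratonovich sum - W_T^2/2      :", strat - 0.5 * w_T ** 2)
print("Ito sum - (W_T^2 - sum dW^2)/2  :", ito - 0.5 * (w_T ** 2 - qv))
```

Both printed differences vanish up to floating-point rounding, for any path and any partition; only the value of the quadratic variation (close to $T$ for fine partitions) depends on the sample.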

Suppose $\Gamma$ is a solution to (3.3) such that $\Gamma_0 = x$ a.s. and $\Gamma$ satisfies (3.4) with
respect to an $\mathbb{R}^k$-valued Brownian motion $W$ defined on a filtered probability space
$(\Omega, \mathcal{F}, (\mathcal{F}_t), P)$. Then we will write $\Gamma = \Gamma^{x,W}$ to remember these data. The explosion
time $\zeta$ of a solution $\Gamma^{x,W}$ is a stopping time on $(\Omega, \mathcal{F}, (\mathcal{F}_t))$ with the following property:
the path $\Gamma^{x,W}_{[0,T]}(\omega)$ is contained in $Q$ for all $T < \zeta(\omega)$ but if $\zeta(\omega) < \infty$ then $\Gamma^{x,W}_{[0,\zeta(\omega))}(\omega)$
is not contained in any compact subset of $Q$. The following is a partial account of [26,
Theorems V.1.1 and V.1.2] and [18, Theorem (7.21)] that is sufficient for our purposes.

Theorem 3.1. Let the assumptions be as above and consider equation (3.3).

(1) For each initial condition, $\Gamma_0 = x$ a.s., and continuous $(\mathcal{F}_t)$-adapted Brownian
motion $W$, a solution $\Gamma^{x,W}$ exists and is unique up to explosion time.

(2) Let $P_x := \hat\Gamma^{x,W}_* P$. Then $P_x$ is independent of $W$ and $(P_x)$ is a system of diffusion
measures generated by the second order differential operator

\[ A = X_0 + \frac{1}{2} \sum_{i=1}^{k} X_i X_i \tag{3.5} \]

which acts on the space $C^\infty(Q)_0$ of smooth functions with compact support.
Assume that $Q$ is endowed with a linear connection $\nabla : \mathfrak{X}(Q) \times \mathfrak{X}(Q) \to \mathfrak{X}(Q)$,
where $\mathfrak{X}(Q)$ denotes the Lie algebra of smooth vector fields on $Q$.

If $f \in C^\infty(Q)$, its Hessian is defined by $\mathrm{Hess}(f) := \nabla\nabla f$, i.e., $\mathrm{Hess}(f)(X, Y) =
X(Y(f)) - (\nabla_X Y)(f)$ for any $X, Y \in \mathfrak{X}(Q)$. The Hessian is bilinear in $X$ and $Y$ but is
not symmetric, unless $\nabla$ is torsion-free.

Let $\Gamma$ be a diffusion in $Q$ with generator $A$. Then the drift of $\Gamma$ with respect to $\nabla$ is
defined to be the first order part of $A$ which is determined by $\nabla$. If $A$ is of the form (3.5)
then this is $X_0 + \frac{1}{2} \sum_i \nabla_{X_i} X_i$.

According to [18, Theorem 7.31], the $A$-diffusion $\Gamma$ is a martingale in $(Q, \nabla)$ if and
only if $A$ is purely second order with respect to $\nabla$, i.e., the $\nabla$-drift vanishes. In [18] this
is stated for torsion-free connections but it is noted that one can use the same definition
for connections with torsion.

If $(Q, \mu)$ is a Riemannian manifold then an $A$-diffusion is called Brownian motion if
$A = \frac{1}{2}\Delta$ where $\Delta := \mathrm{div}\,\mathrm{grad} = -\delta d$ is the metric Laplacian; see [26, Def. V.4.2] or
[18, Def. 5.16].

To construct Brownian motion in (Q, /i), we need the principal connection 

on the orthonormal frame bundle p : d Q over (Q, fi), uniquely induced by the Levi- 
Civita connection on Q (whose Christoffel symbols in a chart are denoted by T\i). We 
recall its construction and basic properties. Let u G 5^ with base point p(u) = q E Q. 
Then we define the horizontal bundle as Hor'^ = LI«ej-Hor^, where the horizontal space 
at u, a vector subspace of Tud, is given by 

Hor- :=T,a(T,g); (3.6) 

here a is local section of p : ^ Q such that a{q) = u and V^cij = for all X G TgQ 
and local vector fields ctj := cr(ej) with {ei,. . . , e^} being the standard basis in R.^. 
We may express (3.6) in local coordinates {q\ ) defined on a bundle coordinate patch 

U xV ^^as 

Restricting u to this coordinate patch, it can be written as pr2 + aJioc : T{U xV) ^ so{d), 
where wioc = (w^ ) G Vl^{U; so{d)). It follows that, in terms of the Christoffel symbols, 

= Tiiu]dqK (3.8) 

Thus, the local expression of the horizontal lift defined by uj is \i\^u ( ) = ~i^~^ki'^''j'^.- 
An orthonormal frame $u \in \mathfrak{F}$ can be regarded as an isometry $u : \mathbb{R}^d \to T_{p(u)}Q$, where
$d = \dim Q$. Define the canonical horizontal vector fields $L_i \in \mathfrak{X}(\mathfrak{F}, \mathrm{Hor}^\omega)$, $i = 1, \ldots, d$,
by

\[ L_i(u) = \mathrm{hl}^\omega_u(u(e_i)), \tag{3.9} \]

where $\mathrm{hl}^\omega : \mathfrak{X}(Q) \to \mathfrak{X}(\mathfrak{F})$ is the horizontal lift map of $\omega$. If $(W^i)$ is Brownian motion in
$\mathbb{R}^d$ and $\Gamma$ solves the Stratonovich equation

\[ \delta\Gamma = \sum_i L_i(\Gamma)\,\delta W^i \tag{3.10} \]

then $p \circ \Gamma$ is a diffusion in $Q$ with generator $\frac{1}{2}\Delta^\mu$, that is, a Brownian motion. This is
explained in [26, Chapter V.4] and follows also from Theorem 3.2 below; the essential
observation in this context is that the Stratonovich operator $S(u, w, w') = S(u, w)w' =
\sum_i L_i(u)\langle e_i, w' \rangle$ of (3.10) is equivariant:

\[ S(ug, g^{-1}w, g^{-1}w') = \sum_i \big( \mathrm{hl}^\omega_u(u(g e_i))\, \langle e_i, g^{-1}w' \rangle \big) g = S(u, w, w')\,g \]

for the principal right action of the structure group, i.e., $g \in O(d)$. Indeed, this follows
since $\mathrm{hl}^\omega : \mathfrak{F} \times_Q TQ \to \ker\omega = \mathrm{Hor}^\omega \subset T\mathfrak{F}$ is $O(d)$-equivariant.

To connect with Theorem 3.2, the principal right action can be turned to a left action 
via inversion in the group. 

3.B Equivariant reduction 

Equivariant reduction is a natural extension of the reduction theory of [33, Theorem 3.1]. 
While the results of [33] are stronger, in the sense that they provide a Stratonovich equa- 
tion on the base space, they are only applicable when the original Stratonovich operator 
is G-invariant (i.e., equivariant with respect to the trivial action on the source space). By 
contrast, the observation in equivariant reduction is that although the upstairs Stratonovich 
operator is not projectable, the diffusion still factors to a diffusion in the base and the 
downstairs generator is induced from that of the original diffusion on the total space. 

Two immediate examples are the construction of Brownian motion on a general Rie- 
mannian manifold as well as a stochastic version of Calogero-Moser systems (see below). 
For both of these cases, the diffusion upstairs is defined in terms of a Stratonovich operator 
which is equivariant but not projectable. 

Let $(\Omega, \mathcal{F}, (\mathcal{F}_t), P)$, $Q$, $X_0, X_1, \ldots, X_k \in \mathfrak{X}(Q)$ and $\delta\Gamma = S(Y, \Gamma)\delta Y$ be as before.
Suppose there is a Lie group $G$ which acts smoothly and properly on $Q$ from the left. We
continuously extend this action to the one point compactification $\dot Q$ by requiring $\infty$ to be
a fixed point. Let $\pi : Q \to Q/G$ be the projection and $C^\infty(Q)^G$ denote the subspace of
$G$-invariant smooth functions on $Q$. Note that $Q/G$ need not be a manifold; in general
$Q/G$ is a topological space which is naturally stratified by smooth manifolds (see, e.g.,
[14, Chapter 2]).
In the following all actions are tangent lifted, where appropriate, without further
notice. Generally, for Lie group actions, we will interchangeably use the notation $g \cdot q$ and
$gq$.

Theorem 3.2. Given is a group representation $\rho : G \to O(k)$ and let $O(k)$ act on $\mathbb{R}^{k+1} =
\mathbb{R} \times \mathbb{R}^k$ such that the first factor is acted upon trivially. If the Stratonovich operator $S$
satisfies the equivariance property

\[ S(gx, \rho(g)y, \rho(g)y') = g\,S(x, y, y') \tag{3.11} \]

for all $(x, y, y') \in Q \times T\mathbb{R}^{k+1}$, then the diffusion $\Gamma$ induces a diffusion $\pi \circ \Gamma$ in $Q/G$.
Moreover, the generator $A$ of the diffusion $\Gamma$ on $Q$ preserves $C^\infty(Q)^G$ and the induced
generator $A_0$ of the diffusion $\pi \circ \Gamma$ on $Q/G$ is characterized by

\[ \pi^*(A_0 f) = A(\pi^* f) \tag{3.12} \]

for all $f \in C^\infty(Q/G) := \{ f \in C(Q/G) : \pi^* f \in C^\infty(Q)^G \}$.
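Proof. Let us begin by noting that $g\Gamma^{x,W} = \Gamma^{gx,\rho(g)W}$; indeed,

\[ \delta(g\Gamma^{x,W}) = g\,S(Y, \Gamma^{x,W})\delta Y = S(\rho(g)Y, g\Gamma^{x,W})\,\delta(\rho(g)Y) \]

whence $\tilde\Gamma := g\Gamma^{x,W}$ satisfies $\tilde\Gamma_0 = gx$ a.s. and $\delta\tilde\Gamma = S(\rho(g)Y, \tilde\Gamma)\delta(\rho(g)Y)$. By existence
and uniqueness of solutions the claim follows. In particular, we have $\pi \circ \Gamma^{x,W} = \pi \circ
\Gamma^{gx,\rho(g)W}$.

Claim:

\[ P_{gx} = g_* P_x \tag{3.13} \]

where $G$ acts on $W(Q)$ as $g : w \mapsto (t \mapsto g\,w(t))$. To see this, let $S \subset W(Q)$ be a Borel
cylinder set. This means that there are $l \in \mathbb{N}$, $0 \le t_1 < \ldots < t_l \in \mathbb{R}_+$, and a Borel
set $A \subset \prod^l \dot Q$ such that $S = \mathrm{ev}(t_1, \ldots, t_l)^{-1}(A)$, where $\mathrm{ev}(t_1, \ldots, t_l) : W(Q) \to \prod^l \dot Q$,
$w \mapsto (w(t_i))_{i=1}^l$. From the identity $(\hat\Gamma^{x,\rho(g)W})_* P = (\hat\Gamma^{x,W})_* P$ we find

\[ P_{gx}(S) = (\hat\Gamma^{gx,\rho(g)W})_* P(S) = P\{\omega : (\Gamma^{gx,\rho(g)W}_{t_i}(\omega))_{i=1}^l \in A\}
= P_x(\mathrm{ev}(t_1, \ldots, t_l)^{-1}(g^{-1}A)) = P_x(g^{-1}S) \]

which proves (3.13).

Consider the push forward map $\pi_* : W(Q) \to W(Q/G)$, $w \mapsto \pi \circ w$. It is straightforward
to see that $\mathcal{B}(W(Q/G)) = \pi_*\mathcal{B}(W(Q))$. For $S_0 = \pi_*(S) \in \mathcal{B}(W(Q/G))$ we may
write the law $(P_{[x]})_{[x] \in Q/G}$ of $\pi \circ \Gamma$ as

\[ P_{[x]}(S_0) = (\pi \circ \hat\Gamma^{gx,\rho(g)W})_* P(S_0) = P_{gx}(\pi_*^{-1}(S_0)). \]

By (3.13) this does not depend on $g \in G$.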
Let us show that the system $(P_{[x]})_{[x] \in Q/G}$ satisfies the strong Markov property. Let $p :=
\pi_* : W(Q) \to W(Q/G)$, $[x] \in Q/G$, $\tau : W(Q/G) \to \bar{\mathbb{R}}_+$ be a $(\mathcal{B}_t(W(Q/G)))_t$-stopping
time, and $F : W(Q/G) \times W(Q/G) \to \mathbb{R}$ a bounded $\mathcal{B}_\tau(W(Q/G)) \times \mathcal{B}(W(Q/G))$-
measurable function. Then $p^{-1}(\mathcal{B}_t(W(Q/G))) \subset \mathcal{B}_t(W(Q))$ and $p^*\tau = \tau \circ p : W(Q) \to
\bar{\mathbb{R}}_+$ is a $(\mathcal{B}_t(W(Q)))_t$-stopping time. For $s \in \mathbb{R}_+$ let $\Sigma_s$ be the time shift operator defined
in (3.1) and observe that $(\Sigma_s p w)(t) = pw(s + t) = \pi(w(s + t)) = (p\Sigma_s w)(t)$. (We use
the same notation for the time-shift on $W(Q)$ and that on $W(Q/G)$.) Now, since $P_{[x]}$ is
the push forward of $P_x$ via $p$, we can use the strong Markov property of $(P_x)_x$ to conclude
that

\[ \int_{\{\tau(w) < \infty\}} F(w, \Sigma_{\tau(w)} w)\, P_{[x]}(dw) = \int_{\{\tau(u) < \infty\}} \Big( \int_{W(Q/G)} F(u, w)\, P_{u(\tau(u))}(dw) \Big) P_{[x]}(du) \]

which, according to (3.2), shows that $(P_{[x]})_{[x]}$ is strong Markov.

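To show that $\sum_i X_i X_i f \in C^\infty(Q)^G$ for all $f \in C^\infty(Q)^G$ consider the standard basis
$\{e_0, e_1, \ldots, e_k\}$ of $\mathbb{R} \times \mathbb{R}^k$. For $j = 1, \ldots, k$ we find

\[ g \cdot X_j(x) = g \cdot S(x, y, e_j) = S(gx, \rho(g)y, \rho(g)e_j) = \sum_k g_{kj} X_k(gx), \]

where $g_{kj} := \langle e_k, \rho(g)e_j \rangle$ is independent of $x \in Q$. Since $\sum_j g_{ij} g_{kj} = \delta_{ik}$,

\[ X_i(gx) = \sum_{j,k} g_{ij} g_{kj} X_k(gx) = \sum_j g_{ij}\, g \cdot X_j(x). \]

Thus $(df(X_i))(gx) = \sum_j g_{ij} (df(X_j))(x)$ for $f \in C^\infty(Q)^G$ and also

\[ d(df(X_i))(gx) \circ T_x g = d\Big( \sum_j g_{ij}\, df(X_j) \Big)(x) = \sum_j g_{ij}\, d(df(X_j))(x). \]

This implies that

\[ \sum_i (X_i X_i f)(gx) = \sum_i \big\langle d(df(X_i))(gx), X_i(gx) \big\rangle
= \sum_{i,j,k} \big\langle g_{ij}\, d(df(X_j))(x) \circ (T_x g)^{-1}, g_{ik} (T_x g) \cdot X_k(x) \big\rangle
= \sum_j (X_j X_j f)(x). \]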
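Similarly, it is also easy to see that $X_0$ is $G$-invariant. Thus the generator $A = X_0 +
\frac{1}{2} \sum_i X_i X_i$ acts on $C^\infty(Q)^G$, whence it induces a projected operator $A_0$ characterized by

\[ A \circ \pi^* = \pi^* \circ A_0. \]

Finally, to see that $A_0$ is the generator of $\pi \circ \Gamma$ we need to show that, for all $t \in \mathbb{R}_+$,
$[x] \in Q/G$, and $f \in C^\infty(Q/G)_0$, the $\mathbb{R}$-valued process

\[ M^f_t : W(Q/G) \to \mathbb{R}, \qquad M^f_t(w) := f(w(t)) - f(w(0)) - \int_0^t (A_0 f)(w(s))\, ds \]

is a $P_{[x]}$-martingale on $(W(Q/G), \mathcal{B}(W(Q/G)))$ for the filtration $(\mathcal{B}_t(W(Q/G)))_t$. See
[26, Def. IV.5.3]. This means that for all $t \ge 0$, $s \in [0, t]$, and $A \in \mathcal{B}_s(W(Q/G))$ we
should check that (see [41, Chapter V])

\[ \int_A E^{[x]}\big[ M^f_t \mid \mathcal{B}_s(W(Q/G)) \big](w)\, P_{[x]}(dw) = \int_A M^f_s(w)\, P_{[x]}(dw), \]

where $E^{[x]}$ denotes the expectation on $(W(Q/G), \mathcal{B}(W(Q/G)))$ with respect to $P_{[x]}$. Indeed,

\begin{align*}
\int_A E^{[x]}\big[ M^f_t \mid \mathcal{B}_s(W(Q/G)) \big](w)\, P_{[x]}(dw)
&= \int_A M^f_t(w)\, P_{[x]}(dw) \\
&= \int_{p^{-1}(A)} (p^* M^f_t)(u)\, P_x(du) \\
&= \int_{p^{-1}(A)} E^{x}\big[ M^{\pi^* f}_t \mid \mathcal{B}_s(W(Q)) \big](u)\, P_x(du) \\
&= \int_{p^{-1}(A)} M^{\pi^* f}_s(u)\, P_x(du) \\
&= \int_{p^{-1}(A)} (p^* M^f_s)(u)\, P_x(du) \\
&= \int_A M^f_s(w)\, P_{[x]}(dw).
\end{align*}

Here, $M^{\pi^* f}_t : W(Q) \to \mathbb{R}$ is analogously defined to $M^f_t$. We have used that $M^{\pi^* f}$ is a
$P_x$-martingale with respect to $(\mathcal{B}_t(W(Q)))_t$ for all $x \in Q$ and that $p^* M^f_t = M^{\pi^* f}_t$ which
holds because of $(A_0 f) \circ \pi = A(\pi^* f)$. □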
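
As a sanity check of the relation $\pi^*(A_0 f) = A(\pi^* f)$, one can look at a simple example that is not taken from the text: planar Brownian motion ($Q = \mathbb{R}^2$, $X_0 = 0$, $X_i = \partial_i$) with $G = SO(2)$ acting by rotations and $\rho$ the standard representation. The upstairs generator is $A = \frac{1}{2}\Delta$, and for a rotation-invariant function $\pi^* f(x, y) = f(r)$ the induced operator on the radial variable is the Bessel generator $A_0 f = \frac{1}{2}(f'' + f'/r)$. The Python sketch below compares both sides by central finite differences; the test function, step size, and sample points are arbitrary choices made for this illustration.

```python
import math


def f(r):
    """Radial profile of a rotation-invariant test function (arbitrary choice)."""
    return math.exp(-r * r)


def F(x, y):
    """Pull-back of f to the plane, i.e. the G-invariant function pi^* f."""
    return f(math.hypot(x, y))


def generator_upstairs(x, y, h=1e-3):
    """A(pi^* f) = (1/2) * Laplacian of F, by central differences."""
    lap = (F(x + h, y) + F(x - h, y) + F(x, y + h) + F(x, y - h)
           - 4.0 * F(x, y)) / (h * h)
    return 0.5 * lap


def generator_downstairs(r, h=1e-3):
    """A_0 f = (1/2) * (f'' + f'/r), by central differences (r > 0)."""
    d1 = (f(r + h) - f(r - h)) / (2.0 * h)
    d2 = (f(r + h) - 2.0 * f(r) + f(r - h)) / (h * h)
    return 0.5 * (d2 + d1 / r)


points = [(0.3, 0.4), (1.0, -0.5), (-0.7, 0.2)]
errors = [abs(generator_upstairs(x, y) - generator_downstairs(math.hypot(x, y)))
          for x, y in points]
print(max(errors))
```

The two sides agree up to discretization error at points away from the origin, which is the singular stratum of $Q/G = [0, \infty)$ in this example.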



Stochastic Calogero-Moser systems 

To construct classical trigonometric or rational Calogero-Moser models one can take the
configuration space $Q$ to be a (real or complex) semisimple Lie group $G$ or a semisimple
Lie algebra $\mathfrak{g}$, respectively. The metric $\mu$ on $Q$ is then accordingly given by the
(essentially unique) bi-invariant (pseudo-)metric in the group or the Ad-invariant
non-degenerate bilinear form in the Lie algebra. Thus one obtains a $G$-invariant Hamiltonian
system $(T^*Q, \Omega^Q, \mathcal{H})$ where $\Omega^Q$ is the canonical symplectic form on $T^*Q$, $\mathcal{H}$ is the
kinetic energy Hamiltonian, and $G$ acts by the cotangent lift of the conjugation action or
the adjoint action, respectively. The resulting Calogero-Moser system is then realized
by passing to the (singular) symplectic quotient of $(T^*Q, \Omega^Q, \mathcal{H})$ with respect to the
$G$-action. See [28, 19, 21]. In other words, Calogero-Moser systems are obtained by reducing
the Hamiltonian description of geodesic motion on the Riemannian manifold $(Q, \mu)$ with
respect to its obvious symmetry group.

Here we propose the stochastic analogue of this construction which should consist
of reducing the Hamiltonian construction of Brownian motion on $(Q, \mu)$ with respect to
the $G$-action. To this end, we consider the Hamiltonian version in [32] of (3.10). Using
the left trivialization we may write $TQ = Q \times \mathfrak{g}$ (recall that $Q = G$ or $Q = \mathfrak{g}$) and
choose an orthonormal basis $L_i$ of $\mathfrak{g}$ with respect to the Ad-invariant inner product $\langle \cdot, \cdot \rangle$;
suppose from now on, for simplicity of exposition, that $G$ is compact. We obtain a
$\mathfrak{g}$-valued Hamiltonian

\[ H = (H^i) : T^*Q = Q \times \mathfrak{g}^* \to \mathfrak{g}, \quad (q, p) \mapsto \sum_i \langle p, L_i \rangle L_i. \]

The Hamiltonian version of Brownian motion is determined by the associated Stratonovich
equation

\[ \delta\Gamma = \sum_i X_{H^i}(\Gamma)\, \delta W^i \]

where $W$ is Brownian motion in $\mathfrak{g} = \mathbb{R}^n$. It is shown in [32] that $\tau \circ \Gamma$ is Brownian motion
in $(Q, \mu)$, where $\tau : T^*Q \to Q$ is the projection. In the left trivialization $T^*Q = Q \times \mathfrak{g}^*$
the Hamiltonian $H$ is nothing but the projection onto the second factor when $\mathfrak{g}$ and $\mathfrak{g}^*$
are identified. Clearly, $H$ is not $G$-invariant but it is $G$-equivariant for the Ad-action on
$\mathfrak{g}$. It is easy to see that the same is true for the Stratonovich operator $(q, p; w, w') \mapsto
\sum_i X_{H^i}(q, p) \langle L_i, w' \rangle$. In fact, we are ultimately concerned with the Stratonovich equation
$\delta(\tau \circ \Gamma) = S(W, \Gamma)\delta W = \sum_i \delta W^i L_i = \delta W$ and now it is evident that $S(g \cdot q, g \cdot (w, w')) =
\mathrm{Ad}(g)w'$ whence we need the $\mathrm{Ad}(G)$-action on $(w, w')$ to make the Stratonovich operator
$S : Q \times T\mathfrak{g} \to TQ$ equivariant for the respective actions. Thus the above theorem applies
and we obtain a diffusion $\pi \circ \tau \circ \Gamma$ in the (singular) space $Q/G$ where $\pi : Q \to Q/G$ is
the projection.

This construction has been carried out in [24] where it is shown that the associated
stochastic Hamilton-Jacobi equation of [34] is related to the quantum Calogero-Moser
Schrödinger equation of [38, 39].

The issue of equivariant reduction leads immediately to the setting of [16, 17]. There,
one of the topics treated is that of a diffusion on the total space of a principal bundle such
that the diffusion factors through the projection and the generator induces also a generator
on the base.

3.C Reconstruction of an equivariant diffusion 

Proposition 3.3 that we shall state and prove in this subsection will be used in the examples 
considered later on. Before we can state it, we need to recall some notions of [16, 17]. 

Let $\pi : Q \to Q/G =: M$ be a left $G$-principal bundle with connection form $A \in
\Omega^1(Q; \mathfrak{g})$. Denote the horizontal and vertical spaces by Hor and Ver, respectively. Assume
that $\Gamma$ is a diffusion in $Q$ generated by a Stratonovich equation

\[ \delta\Gamma = X_0(\Gamma)\delta t + \sum_{a=1}^{m} X_a(\Gamma)\delta W^a + v_0(\Gamma)\delta t + \sum_{\alpha=1}^{k} v_\alpha(\Gamma)\delta B^\alpha \tag{3.14} \]

such that $X_0, X_1, \ldots, X_m$ are basic, $v_0, v_1, \ldots, v_k$ are vertical vector fields, and $(W, B)$ is
Brownian motion in $\mathbb{R}^{m+k}$ with respect to the underlying filtered probability space. Thus
the generator of $\Gamma$ is

\[ A^Q = X_0 + \frac{1}{2} \sum_{a=1}^{m} X_a X_a + v_0 + \frac{1}{2} \sum_{\alpha=1}^{k} v_\alpha v_\alpha. \]

By construction, this generator can be decomposed as follows. There are $Y_0, Y_1, \ldots, Y_m \in
\mathfrak{X}(M)$ such that $X_0 = \mathrm{hl}^A(Y_0), \ldots, X_m = \mathrm{hl}^A(Y_m)$ and $x_t := \pi \circ \Gamma_t$ is a diffusion in $M$
with generator $A^M = Y_0 + \frac{1}{2} \sum_a Y_a Y_a$. Note that $\pi^* \circ A^M = A^Q \circ \pi^*$, that is, $A^Q$ is
projectable. Moreover, $A^Q$ decomposes into a horizontal part $A^h = X_0 + \frac{1}{2} \sum_a X_a X_a$ and
a vertical part $A^v = v_0 + \frac{1}{2} \sum_\alpha v_\alpha v_\alpha$.

In [16, 17] one of the main points is that, assuming a non-degeneracy condition, the 
induced operator A^' gives rise to a connection m -n : Q ^ M with respect to which 
the operator A'^ can be decomposed. In our applications the connection is given by the 
problem and the decomposition into horizontal and vertical part arises naturally. 

We are going to use the observation of [16, 17] that, for q E Q and Tq = q a.s., the 
diffusion F can be written as 

T, = gf-x1. (3.15) 

Here xj* is the diffusion in Q with generator A'^ and Xq = q a.s. That is, is the horizontal 
lift of the A^'^ diffusion xt. The process gf' in G with gfg'' = e a.s. can be written as the 
solution to a time-dependent Stratonovich equation: for w E W{Q) we define 

6gr = T^Rg^ {A,^.^M9r ■ wt) + J2^aT-^M9r " ^T))- (3-16) 

Here Rg : G ^ G is the action by right multiplication of G on itself. Equation (3.15) is 
reminiscent of a well-known concept in mechanics and can be viewed as a reconstruction 
equation (see, e.g., [1, §4.3], [35, §3], [36, Theorem 11.8]). 
Let $Q_e(w)$ be the law on $W(G)$ of (3.16). This depends only on $\pi \circ w \in W(M)$. Let
$x_t$ be an $A^M$-diffusion path in $M$ with horizontal lift $x^h$. Consider the evaluation map
$\mathrm{ev}_t : W(G) \to G$, $g(\cdot) \mapsto g(t)$. We call

\[ E^{Q_e(x^h)}[\mathrm{ev}] \cdot x^h : t \mapsto E^{Q_e(x^h)}[\mathrm{ev}_t] \cdot x^h_t \]

the mean reconstruction of the sample path $x_t$.

From now on, we shall assume that $G$ can be realized as a matrix group $G \subset GL(N) \subset
\mathbb{R}^{N \times N}$. The flat connection on $\mathbb{R}^{N \times N}$ thus induces a connection $\nabla$ on $G$. For $X \in \mathfrak{g} \subset \mathfrak{gl}(N)$
we denote the associated left- and right-invariant vector field by $L_X(g) = T_e L_g(X) =
gX \in \mathfrak{gl}(N)$ and $R_X(g) = T_e R_g(X) = Xg \in \mathfrak{gl}(N)$.

Proposition 3.3. If $x \in W(M)$ is an $A^M$-sample path and $v_0, \ldots, v_k$ are $G$-invariant
vector fields then the expectation $E^{Q_e(x^h)}[\mathrm{ev}_t] =: c(t)$ associated to the mean
reconstruction of an $A^M$-diffusion path $x$ in $M$ is given as the solution to the left-invariant
time-dependent ODE

k 

a=l 

Proof. We can use G-invariance of the vector fields together with the equivariance prop- 
erty Agq{gUq) = Ad{g)AqUq, Q E Q, Uq E TqQ, of thc principal bundle connection form 
A to rewrite the defining equation (3.16) as 

Sgf =T,R^^n (t,{R-^\oL^^.) (^A,n(^vo{x1)6t + J2M4)SB''))) ■ 

Letting 

9{t) = {gitYnU ■.= gf, 

a{t)^ = (a(t)r)mn := A^nVaix^) G gi{N), 

b(t) = (bitrimn := A^nVoixl) G Qi{N) , 

this becomes with the summation convention, for Z = 1, . . . , A^, the Stratonovich equation 

6g^ = (gLarSB'^ + glr^'St) 

\ / n=l 

in when we think of g^ as a column vector and suppress the time-dependency. The 
associated Ito equation in is, for / = 1, . . . , A^, 

dg' = [gLaTdB- + (^^6- + \al-glaZ)dt) . 

\ / 71=1 

(See e.g. [37, Equ. (6.1.3)] for the conversion rule of Stratonovich equations to Ito equa- 
tions.) This is a linear time-dependent Ito equation in M^. Hence, the mean motion is 






found by erasing the martingale term in the corresponding integral equation. This implies
that the expected motion of $g$ is given by

\[ c'(t) = \frac{\partial}{\partial t} E[g](t) = E[g](t) \Big( b(t) + \frac{1}{2} a(t)_\alpha a(t)_\alpha \Big) \]

which is an equation in $GL(N)$. Since $a(t)_\alpha a(t)_\alpha = \nabla_{L_{a(t)_\alpha}} L_{a(t)_\alpha}(e)$ the claim
follows. □
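
The mechanism of the proof (convert the Stratonovich equation to Itô form, then drop the martingale term) can be illustrated in the simplest scalar case. The Python sketch below is an illustration under assumptions that are not part of the text: $N = 1$, a single noise term, and constant coefficients $a$ and $b$, so that the equation $\delta g = g(a\,\delta B + b\,\delta t)$ with $g_0 = 1$ has the explicit solution $g_t = \exp(bt + aB_t)$. The mean ODE $c'(t) = c(t)(b + \frac{1}{2}a^2)$ then predicts $E[g_t] = \exp((b + \frac{1}{2}a^2)t)$, which is compared with a seeded Monte Carlo average.

```python
import math
import random


def mean_by_simulation(a=0.5, b=0.1, t=1.0, n_samples=200_000, seed=0):
    """Monte Carlo estimate of E[g_t] for dg = g (a dB + b dt), Stratonovich sense."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        B_t = rng.gauss(0.0, math.sqrt(t))      # Brownian motion at time t
        total += math.exp(b * t + a * B_t)      # explicit solution with g_0 = 1
    return total / n_samples


def mean_by_ode(a=0.5, b=0.1, t=1.0):
    """Solution of c'(t) = c(t) (b + a^2 / 2) with c(0) = 1."""
    return math.exp((b + 0.5 * a * a) * t)


simulated = mean_by_simulation()
predicted = mean_by_ode()
print(simulated, predicted)
```

The extra drift $\frac{1}{2}a^2$ is exactly the Itô correction term; without it the ODE would underestimate the mean.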



3.D Time reversible diffusions 

The references for this section are [31, 29, 26]. Let $(M, \mu)$ be a Riemannian manifold and
$\Gamma$ an $A$-diffusion in $M$ where

\[ A = \frac{1}{2}\Delta + \frac{1}{2} b \tag{3.17} \]

with $\Delta$ the Laplace-Beltrami operator and $b$ a vector field. Let $p(t, x, y)$ denote the
transition probability density of $\Gamma$ (the minimal fundamental solution, see [29]). If $\mathrm{vol}_\mu$ is
the Riemannian volume form on $M$ then, for $(t, x, S) \in \mathbb{R}_+ \times M \times \mathcal{B}(M)$, the transition
probability of $\Gamma$ is

\[ P(t, x, S) = \int_S p(t, x, y)\, \mathrm{vol}_\mu(y). \]

This quantifies the probability that a diffusion path starting at $x$ is in $S$ after time $t$. The
diffusion $\Gamma$ is said to be symmetrizable if there is a smooth function $\phi > 0$ such that

\[ p(t, x, y)\phi(x) = p(t, y, x)\phi(y) \quad \text{for all } (t, x, y) \in \mathbb{R}_+ \times M \times M \tag{3.18} \]

in which case $\Gamma$ is called $\phi$-symmetric.
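
Condition (3.18) can be verified by hand in a standard example that is not taken from the text: the Ornstein-Uhlenbeck process on $M = \mathbb{R}$ with the flat metric. Taking $\phi(x) = e^{-x^2}$ and $b = \mathrm{grad}(\log\phi) = -2x$ in (3.17) gives the equation $dX = -X\,dt + dW$, whose transition density is Gaussian with mean $x e^{-t}$ and variance $(1 - e^{-2t})/2$. The Python sketch below evaluates both sides of (3.18) at a few arbitrarily chosen points; note that $\mathbb{R}$ is not compact, so this only illustrates the symmetry condition itself.

```python
import math


def p(t, x, y):
    """Transition density of dX = -X dt + dW from x to y in time t > 0."""
    s = math.exp(-t)
    var = 0.5 * (1.0 - s * s)
    return math.exp(-(y - x * s) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def phi(x):
    """Candidate symmetrizing density; b = grad(log phi) = -2x."""
    return math.exp(-x * x)


samples = [(0.5, 0.3, -1.2), (1.0, 2.0, 0.1), (2.5, -0.4, 0.9)]
gaps = [abs(p(t, x, y) * phi(x) - p(t, y, x) * phi(y)) for t, x, y in samples]
print(max(gaps))
```

Expanding the exponents shows why the gap vanishes: both products have exponent $-(x^2 + y^2 - 2xy\,e^{-t})/(1 - e^{-2t})$, which is symmetric in $x$ and $y$.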

A probability measure $\nu$ on $M$ is an equilibrium measure if $\nu(M) = 1$ and

\[ P(t, x, S) \to \nu(S) \quad \text{as } t \to \infty \]

for all $(x, S) \in M \times \mathcal{B}(M)$. Equilibrium measures, if they exist, are unique. If $\nu = \phi\,\mathrm{vol}_\mu$
is an equilibrium measure then we refer to $\phi$ as the equilibrium distribution.

The diffusion $\Gamma$ is called time-reversible if its law coincides with that of the
time-reversed process; this means that for each $T > 0$ the law $P_{[0,T]}$ of $[0, T] \times \Omega \to M$,
$(t, \omega) \mapsto \Gamma_t(\omega)$ is the same as the law $P^-_{[0,T]}$ of $[0, T] \times \Omega \to M$, $(t, \omega) \mapsto \Gamma_{T-t}(\omega)$.

The adjoint operator $A^*$ associated to $A$ is given by

\[ A^* f = \frac{1}{2}\Delta f - \frac{1}{2}\mathrm{div}_\mu(f b) \]

where $f \in C^\infty(M)$. Here, the adjoint is with respect to the inner product $\langle f, g \rangle =
\int_M f g\, \mathrm{vol}_\mu$. The following result is essentially due to Kolmogorov.

Theorem 3.4 ([31, 29, 26]). With notation as above the following are true. 
(1) The $A$-diffusion $\Gamma$ is symmetric if and only if $b$ is a gradient. Moreover, if $b =
\mathrm{grad}(\log\phi)$ then $\Gamma$ is $\phi$-symmetric and $A^*\phi = 0$.

(2) $\Gamma$ is time reversible if and only if it is symmetric and has an equilibrium distribution
$\phi$, in which case $\Gamma$ is $\phi$-symmetric.

(3) If $M$ is compact then an equilibrium distribution always exists.

(4) If $M$ is compact then the unique equilibrium distribution is characterized by the
equations $\int_M \phi\, \mathrm{vol}_\mu = 1$ and $A^*\phi = 0$.
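
The identity $A^*\phi = 0$ for $b = \mathrm{grad}(\log\phi)$ follows from $\mathrm{div}(\phi\,\mathrm{grad}\log\phi) = \mathrm{div}(\mathrm{grad}\,\phi) = \Delta\phi$. The Python sketch below checks this numerically on the circle $M = S^1$ with the flat metric; the density $\phi(x) = e^{\cos x}$ and the grid size are arbitrary choices made for this illustration and are not taken from the text. Derivatives are approximated by periodic central differences, so the residual is small but not exactly zero.

```python
import math

N = 2000                                   # number of grid points on the circle
h = 2.0 * math.pi / N
xs = [i * h for i in range(N)]
phi = [math.exp(math.cos(x)) for x in xs]  # positive density (not normalized)
b = [-math.sin(x) for x in xs]             # b = grad(log phi) = d/dx cos(x)


def ddx(values):
    """Periodic central difference approximation of the derivative."""
    n = len(values)
    return [(values[(i + 1) % n] - values[i - 1]) / (2.0 * h) for i in range(n)]


# A* phi = (1/2) Laplacian(phi) - (1/2) div(phi b) = (1/2) d/dx (phi' - phi b)
flux = [dp - p_i * b_i for dp, p_i, b_i in zip(ddx(phi), phi, b)]
adjoint_phi = [0.5 * v for v in ddx(flux)]
residual = max(abs(v) for v in adjoint_phi)
print(residual)
```

Normalizing $\phi$ so that it integrates to one over the circle would give the equilibrium distribution of item (4); the normalization does not affect the equation $A^*\phi = 0$.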

Compactness of $M$ is satisfied in important examples such as the Chaplygin ball or the
two-wheeled carriage studied in Section 5.

Assuming that $M$ is compact, [27, Chapter 5] gives various equivalent conditions for a
diffusion of the form (3.17) to be time-reversible. One such condition is that the diffusion
have vanishing entropy production rate

\[ \lim_{T \to \infty} \frac{1}{T} H\big(P_{[0,T]}, P^-_{[0,T]}\big). \tag{3.19} \]

The relative entropy $H(\mu, \nu)$ of two probability measures $\mu, \nu$ on a measure space $(W, \mathcal{B})$
is defined as (see [27, Definition 1.4.3])

\[ H(\mu, \nu) := \begin{cases} \int_W \log\frac{d\mu}{d\nu}\, \mu(dw) & \text{if } \mu \ll \nu \text{ and } \log\frac{d\mu}{d\nu} \in L^1(d\mu), \\ +\infty & \text{otherwise.} \end{cases} \]
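
For probability measures on a finite set the definition reduces to a finite sum, which makes the two cases of the definition easy to see. The following Python sketch is a discrete illustration only; the function name `relative_entropy` and the representation of measures as lists of point masses are assumptions made for this example.

```python
import math


def relative_entropy(mu, nu):
    """H(mu, nu) for two probability vectors on the same finite set."""
    total = 0.0
    for m, n in zip(mu, nu):
        if m == 0.0:
            continue                 # convention: 0 * log 0 = 0
        if n == 0.0:
            return math.inf          # mu is not absolutely continuous w.r.t. nu
        total += m * math.log(m / n)
    return total


print(relative_entropy([0.5, 0.5], [0.5, 0.5]))
print(relative_entropy([1.0, 0.0], [0.5, 0.5]))
print(relative_entropy([0.5, 0.5], [1.0, 0.0]))
```

The relative entropy vanishes when the two measures coincide, which mirrors the statement that the entropy production rate (3.19) is zero when the forward and time-reversed laws agree.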



4 Non-holonomic diffusions 

Consider a non-holonomic system (Q,V,C) as in Section 2. This section is concerned 
with the study of non-holonomic diffusions on V which should be given by a Stratonovich 
equation of the form 

6T = X^6t + S^{T,W)6W. (4.1) 

Here describes the dynamics of the deterministic system, W is Brownian motion 
in M*^, d = dimQ, and 5''(r, 1^)51^ should be interpreted as a noise term that stems 
from constrained Brownian motion. This is in analogy to [32, Section 3.1] and [5] where 
Hamiltonian diffusions are introduced. However, equation (4.1) does not make sense, in 
general, unless the configuration space is parallelizable. The problem to be considered 
below is to make this equation precise and to study the notion of constrained Brownian 
motion on manifolds. 
4.A Constrained Brownian motion

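Let $(Q, \mathcal{D}, \mathcal{L})$ be a non-holonomic system with symmetry group $G$ as in Section 2.

Let $\rho : \mathcal{F} \to Q$ be the orthonormal frame bundle over $(Q, \mu)$ and denote its structure
group by $K := O(d)$ and its Lie algebra by $\mathfrak{k} := \mathfrak{so}(d)$. The Levi-Civita connection
on $TQ$ gives rise to a uniquely determined principal connection $\omega \in \Omega^1(\mathcal{F}; \mathfrak{k})$ on the
principal bundle $\rho : \mathcal{F} \to Q$; denote by $\mathrm{Hor}^\omega = \ker\omega \subset T\mathcal{F}$ its horizontal subbundle.
Equip $\mathcal{F}$ with a $K$-invariant metric $\nu$ such that $\rho$ becomes a Riemannian submersion and
$\mathrm{Hor}^\omega$ and $\mathrm{Ver}(\rho) := \ker T\rho$ are perpendicular. Since $G$ acts by isometries on $(Q, \mu)$, it
lifts to an action on $\mathcal{F}$ and we may assume $\nu$ to be $G$-invariant; e.g., we could take $\nu$
to be the Sasaki-Mok metric (see the survey [40]). We can use the connection to lift the
constraints to a subbundle $\mathcal{D}^{\mathcal{F}} \subset T\mathcal{F}$ defined via the natural $\omega$-dependent vector bundle
isomorphism

$$\mathcal{D}^{\mathcal{F}} \cong (\mathcal{F} \times_Q \mathcal{D}) \oplus \mathrm{Ver}(\rho).$$

To understand this definition and the isomorphism consider the bundle morphism over $\mathcal{F}$
defined by

$$(\mathcal{F} \times_Q \mathcal{D}) \oplus \mathrm{Ver}(\rho) \longrightarrow \mathrm{Hor}^\omega \oplus \mathrm{Ver}(\rho) \cong T\mathcal{F}.$$

As before, $\mathrm{hl}^\omega : \mathcal{F} \times_Q TQ \to \mathrm{Hor}^\omega$ is the horizontal lift mapping associated to $\omega$. Now
the subbundle $\mathcal{D}^{\mathcal{F}}$ is defined as the image of this morphism.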

Thus (5', T)'^, ^ 1 1 ■ I is a new G-invariant non-holonomic system covering {Q, V, C) 
in the following sense: The G-action lifts to an action on and there is an induced space 

defined by 

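$$\mathcal{C}^{\mathcal{F}} := \{\xi \in T(\mathcal{D}^{\mathcal{F}}) \mid T\tau_{\mathcal{F}}(\xi) \in \mathcal{D}^{\mathcal{F}}\},$$

where $\tau_{\mathcal{F}} : T\mathcal{F} \to \mathcal{F}$ is the tangent bundle projection. Again, we split $T(T\mathcal{F})|\mathcal{D}^{\mathcal{F}} = \mathcal{C}^{\mathcal{F}} \oplus (\mathcal{C}^{\mathcal{F}})^{\Omega^{\mathcal{F}}}$,
where $\Omega^{\mathcal{F}}$ is now the canonical symplectic form on $T\mathcal{F} \cong T^*\mathcal{F}$ (the tangent
and cotangent bundles of $\mathcal{F}$ are identified via the Riemannian metric $\nu$ on $\mathcal{F}$), and
$P^{\mathcal{F}} : \mathcal{C}^{\mathcal{F}} \oplus (\mathcal{C}^{\mathcal{F}})^{\Omega^{\mathcal{F}}} \to \mathcal{C}^{\mathcal{F}}$ denotes the associated projection. The situation is summarized in the
following commutative diagram:

$$\begin{array}{ccc}
T(T\mathcal{F})|\mathcal{D}^{\mathcal{F}} = \mathcal{C}^{\mathcal{F}} \oplus (\mathcal{C}^{\mathcal{F}})^{\Omega^{\mathcal{F}}} & \xrightarrow{\;P^{\mathcal{F}}\;} & \mathcal{C}^{\mathcal{F}} \\
\downarrow {\scriptstyle TT\rho} & & \downarrow {\scriptstyle TT\rho} \\
TTQ|\mathcal{D} = \mathcal{C} \oplus \mathcal{C}^{\Omega} & \xrightarrow{\;P\;} & \mathcal{C}
\end{array} \qquad (4.2)$$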

Indeed, this diagram is commutative since $TT\rho(\mathcal{C}^{\mathcal{F}}) = \mathcal{C}$ and we may regard $(TQ, \Omega)$ as
the symplectic reduction of $(T\mathcal{F}, \Omega^{\mathcal{F}})$ with respect to the $K$-action at 0. In particular,

TTp (P, iX,,j)) = P (TTp iX,,f)) = e XiV) 
for any $f \in C^\infty(TQ)$.

According to Section 3 we can construct Brownian motion on $(Q, \mu)$ by fixing Brownian
motion $W = (W^i)$ in $\mathbb{R}^d$, $d = \dim Q$, and the Hamiltonian

$$H : T\mathcal{F} \to \mathbb{R}^d, \quad (u, \eta) \mapsto \big(\nu(\eta, L_i(u))\big)_i = \big(H^i(u, \eta)\big)_i, \qquad (4.3)$$

where the $L_i$ are defined by (3.9). This gives rise to the Stratonovich operator

where {ei, . . . , ea} is the standard basis in R'^. If solves = S^{W, T")6W then 
o solves (3.10) and p o o is Brownian motion in {Q, /i). 

Definition 4.1. We define constrained Brownian motion to be the process 

in (5, where T^^ is a process in solving the Stratonovich equation 

5T^^ = P^{T^^)S"{W, T^^)5W = J2 X^J,{r^^)6W\ (4.4) 

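Let $(\mathcal{D}^{\mathcal{F}})^\perp$ be the $\nu$-orthogonal of $\mathcal{D}^{\mathcal{F}}$ and $\Pi^{\mathcal{F}} : T\mathcal{F} = \mathcal{D}^{\mathcal{F}} \oplus (\mathcal{D}^{\mathcal{F}})^\perp \to \mathcal{D}^{\mathcal{F}}$ be the
orthogonal projection. Similarly, we define $\Pi : TQ = \mathcal{D} \oplus \mathcal{D}^\perp \to \mathcal{D}$ and we note that

$$\Pi \circ T\rho = T\rho \circ \Pi^{\mathcal{F}}. \qquad (4.5)$$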

Equation (2.13) implies that o F''^ is a diffusion in ^ generated by the Stratonovich 
equation 

The associated Stratonovich operator will be denoted by S^^. It is explicitly given by 

S^^ ■.dy<TR'' ^V^, (u;u;,u;') ^^'^^ 

where $\{e_1, \dots, e_d\}$ is the standard basis in $\mathbb{R}^d$.

Henceforth, local orthonormal frames on $(Q, \mu)$ and local sections of $\rho : \mathcal{F} \to Q$ will
be identified.

Theorem 4.2. The process $\Gamma^{nh}$ is a diffusion in $Q$ (in the sense of Section 3) and its
generator $A$ has the form

$$A = \frac12 \sum_{i=1}^d (\Pi u_i)(\Pi u_i) - \frac12 \sum_{i=1}^d \Pi\nabla_{\Pi u_i} u_i$$

in a local orthonormal frame $u = (u_i)$ on $(Q, \mu)$.

Proof. By Proposition 3.2, to see that $\Gamma^{nh}$ is a diffusion we need to show that

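$$S^{nh}(u \cdot k, w, w') = S^{nh}(u, kw, kw') \cdot k \qquad (4.7)$$

for all $k \in K$. Here, $u \cdot k$ denotes the principal right action of $k \in K$ on $u \in \mathcal{F}$ and
condition (4.7) is equivalent to (3.11) since one can invert a right action to obtain a left
action. Indeed, $\mathcal{D}^{\mathcal{F}} \cong (\mathcal{F} \times_Q \mathcal{D}) \oplus \mathrm{Ver}(\rho)$ and the definition of $\nu$ imply $\Pi^{\mathcal{F}} \circ \mathrm{hl}^\omega = \mathrm{hl}^\omega \circ \Pi$

and therefore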

d 

S'^^{u ■ k, w, w') = J2 ■ k)K■kHke^)){e^, w') 

i=l 

^hl-(n(p(u))u(eO)(e.,W)) - A; 



,?=i 



which proves (4.7). 

In order to calculate the generator $A$, let $f \in C^\infty(Q)_0$ and $u \in \mathcal{F}$. Then

$$A(\rho^* f)(u) = \cdots = \frac12 \sum_{i=1}^d (\Pi u_i)(\Pi u_i) f - \frac12 \sum_{i=1}^d (\Pi\nabla_{\Pi u_i} u_i) f,$$

where we have frequently dropped the base points to simplify the notation.

Alternatively, the above calculation can be done in local coordinates u = (x*, e*) on 
^5". Then, using the summation convention, Ur = e^dm and Il^m = nj^c)„ with dm = 
The Christoffel symbols are given as usual by V(^^9j = T^jdi. Using now the local 
coordinate description (3.7) of Hor"^ it follows that 






This yields 

which immediately yields the formula for A in the statement of the theorem. □ 

The second term in the above formula for $A$ is reminiscent of the non-holonomic
connection $\nabla^{nh}$. This is defined as the linear connection

$$\nabla^{nh} : \mathfrak{X}(Q) \times \mathfrak{X}(Q) \to \mathfrak{X}(Q), \quad (X, Y) \mapsto \nabla_X Y - (\nabla_X \Pi) Y. \qquad (4.8)$$

If $Y$ is a section of $\mathcal{D}$, one obtains the useful identity $\nabla^{nh}_X Y = \Pi\nabla_X Y$. Let $\mathrm{Hess}^{nh}$ be the
Hessian of $\nabla^{nh}$.

Corollary 4.3. The non-holonomic diffusion $\Gamma^{nh}$ is a martingale in $Q$ with respect to the
non-holonomic connection $\nabla^{nh}$.

Proof. Since $A(\rho^* f)(u)$ depends only on $\rho(u)$ (this follows from Theorem 3.2 but can
also be checked directly), we may take a local frame $u = (u_i) = (u_a, u_\alpha)$ which is
adapted to the decomposition $\mathcal{D} \oplus \mathcal{D}^\perp$ such that $u_a$ are local sections of $\mathcal{D}$ and $u_\alpha$ are
local sections of $\mathcal{D}^\perp$. Then $A = \frac12 \sum_a (u_a u_a - \nabla^{nh}_{u_a} u_a) = \frac12 \sum_a \mathrm{Hess}^{nh}(\cdot)(u_a, u_a)$ which is
purely second order, by definition. □

4.B G-Chaplygin diffusions and stochastic non-holonomic reduction

Continue to assume that the non-holonomic system $(Q, \mathcal{D}, \mathcal{L})$ is invariant with respect to
a free and proper action of a Lie group $G$. Denote the projection by $\pi : Q \to Q/G = M$.

The $G$- and $K$-action on $\mathcal{F}$ commute. Thus, we may form the product action of $G \times K$
on $\mathcal{F}$. Since the $H^i$ from (4.3) are $G$-invariant, it follows that the Stratonovich operator
(4.6) satisfies condition (3.11) with respect to the trivial $G$-representation on $\mathbb{R}^d$. Therefore,
$\Gamma^{\mathcal{F},nh}$ induces a diffusion

$$\pi \circ \rho \circ \tau \circ \Gamma^{\mathcal{F},nh} = \pi \circ \Gamma^{nh} =: \Gamma^M$$

on $M = \mathcal{F}/(G \times K)$.

Now we make the additional assumption that the constraints are of Chaplygin type,
i.e., $\mathcal{D}$ is the kernel of a principal connection one-form $\mathcal{A} \in \Omega^1(Q; \mathfrak{g})$. The non-holonomic
connection (4.8) on $Q$ induces a connection on $M$ which will be referred to as the
non-holonomic connection $\nabla^{nh}$ on $M$; it is given by

$$\nabla^{nh} : \mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M), \quad (X, Y) \mapsto T\pi\big(\Pi\nabla_{\mathrm{hl}^{\mathcal{A}} X}(\mathrm{hl}^{\mathcal{A}} Y)\big)$$

where $\mathrm{hl}^{\mathcal{A}} : \mathfrak{X}(M) \to \mathfrak{X}(Q; \mathcal{D})$ (the space of vector fields on $Q$ with values in the vector
subbundle $\mathcal{D} \subset TQ$) is the horizontal lift map of $\mathcal{A}$. Recall from §2.B that the Riemannian
metric $\mu$ on $Q$ naturally induces a Riemannian metric $\mu_0$ on the quotient $M := Q/G$.
To calculate the generator $A^M$ of $\Gamma^M$, take a local orthonormal frame $u = (u_a)$ on $M$.
Similarly as in the proof of Corollary 4.3, the generator becomes

$$A^M = \frac12 \sum_a \big(u_a u_a - \nabla^{\mu_0}_{u_a} u_a\big) + \frac12 \sum_a \big(\nabla^{\mu_0}_{u_a} u_a - \nabla^{nh}_{u_a} u_a\big) = \frac12 \Delta^{\mu_0} + \frac12 b \qquad (4.9)$$

where $b = \sum_a \big(\nabla^{\mu_0}_{u_a} u_a - \nabla^{nh}_{u_a} u_a\big)$.
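Lemma 4.4. $b = \mu_0^{-1}\beta$ where $\beta$ is defined by (2.24).

Proof. Essentially this formula is a special case of [30, Proposition 8.5]. For convenience
we provide a proof by using a local orthonormal frame $(u_a)$ on $M$. Let $K = \zeta \circ \mathrm{Curv}^{\mathcal{A}} \in \Omega^2(Q, TQ)$
be the curvature of $\zeta \circ \mathcal{A} \in \Omega^1(Q, TQ)$ where $\zeta : \mathfrak{g} \ni X \mapsto \zeta_X \in \mathfrak{X}(Q)$ is the
fundamental vector field mapping of the $G$-action. Then

$$\begin{aligned}
\mu_0(\nabla^{\mu_0}_{u_a} u_a, u_b) &= -\mu_0([u_a, u_b], u_a) = -\mu(\mathrm{hl}^{\mathcal{A}}[u_a, u_b], \mathrm{hl}^{\mathcal{A}} u_a) \\
&= -\mu(K(\mathrm{hl}^{\mathcal{A}} u_a, \mathrm{hl}^{\mathcal{A}} u_b), \mathrm{hl}^{\mathcal{A}} u_a) - \mu([\mathrm{hl}^{\mathcal{A}} u_a, \mathrm{hl}^{\mathcal{A}} u_b], \mathrm{hl}^{\mathcal{A}} u_a) \\
&= -\mu(\zeta\,\mathrm{Curv}^{\mathcal{A}}(\mathrm{hl}^{\mathcal{A}} u_a, \mathrm{hl}^{\mathcal{A}} u_b), \mathrm{hl}^{\mathcal{A}} u_a) + \mu(\Pi\nabla_{\mathrm{hl}^{\mathcal{A}} u_a} \mathrm{hl}^{\mathcal{A}} u_a, \mathrm{hl}^{\mathcal{A}} u_b) \\
&= -\langle J(\mathrm{hl}^{\mathcal{A}} u_a), \mathrm{Curv}^{\mathcal{A}}(\mathrm{hl}^{\mathcal{A}} u_a, \mathrm{hl}^{\mathcal{A}} u_b)\rangle + \mu_0(\nabla^{nh}_{u_a} u_a, u_b).
\end{aligned}$$

Therefore,

$$\mu_0\big(\nabla^{\mu_0}_{u_a} u_a - \nabla^{nh}_{u_a} u_a\big) = \langle J(\mathrm{hl}^{\mathcal{A}} u_a), \mathrm{Curv}^{\mathcal{A}}(\mathrm{hl}^{\mathcal{A}} u_b, \mathrm{hl}^{\mathcal{A}} u_a)\rangle\, \mu_0(u_b, \cdot)$$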

where is the horizontal lift of the local vector field Ua G Xioc(M) to G Xioc{TM) 
with respect to the Levi-Civita connection V^". (We have identified linear functions on 
TM and one-forms on M as we did in the definition (2.24).) □ 

Theorem 4.5. The $G$-Chaplygin system $(Q, \mathcal{D}, \mathcal{L})$ has a preserved measure if and only
if the associated diffusion $\Gamma^M$ is symmetric. Moreover, if $b = \mathrm{grad}^{\mu_0}(\log\mathcal{N})$ then the
diffusion is $\mathcal{N}$-symmetric and $\mathcal{N}$ is the density of the preserved measure with
respect to the Liouville volume.

Proof. Using (4.9) and Lemma 4.4 this is a direct consequence of Proposition 2.5 and
Theorem 3.4. □

When $M$ is compact, then we infer from Section 3.D that measure preservation of
the deterministic system is equivalent to time-reversibility of $\Gamma^M$ which in turn is
equivalent to the vanishing of the entropy production rate (3.19) of $\Gamma^M$. Moreover, if
$b = \mathrm{grad}^{\mu_0}(\log\mathcal{N})$ then

$$\Big(\int_M \mathcal{N}\, \mathrm{vol}_{\mu_0}\Big)^{-1} \mathcal{N}$$

is the (unique) equilibrium distribution of $\Gamma^M$. For most systems of practical interest,
such as the Chaplygin ball, the two-wheeled robot, or the snakeboard, the manifold $M$ is
compact.



5 Examples 

5.A The two-wheeled robot 

The configuration space of the two-wheeled robot is

$$Q = S^1 \times S^1 \times SE(2) = \{(\psi^1, \psi^2, x, y, \theta)\}.$$

Here $(\psi^1, \psi^2)$ measure the positions of the wheels with the orientation such that the robot
goes forward when the wheels go backward, and $(x, y, \theta)$ give the overall configuration
of the robot in the plane. Let $G = SE(2)$ and $M := S^1 \times S^1 = Q/G$. We use almost
exactly the same notation as [11, Section 5.2.2]. It is assumed that the two wheels can
be controlled independently and roll without slipping and without lateral sliding on the
plane. The Lagrangian $\mathcal{L}$ of the system is the kinetic energy corresponding to the metric

$$\begin{aligned}
\mu ={}& J_w(d\psi^1 \otimes d\psi^1 + d\psi^2 \otimes d\psi^2) + m(dx \otimes dx + dy \otimes dy) \\
&+ m_0 l \cos\theta\,(dy \otimes d\theta + d\theta \otimes dy) - m_0 l \sin\theta\,(dx \otimes d\theta + d\theta \otimes dx) + J_0\, d\theta \otimes d\theta.
\end{aligned}$$

Here $m = m_0 + 2m_w$, $m_0$ is the mass of the robot without the wheels, $m_w$ is the mass of
each wheel, $J_w$ is the moment of inertia of each wheel, $J_0$ is the moment of inertia of the
robot about the vertical axis, and $l$ is the distance from the vehicle's center of mass to the
midpoint of the axis which connects the two wheels. Let $2c$ denote the distance between
the contact points of the two wheels with the ground, and $R$ the radius of the wheels. The
constraints are given by the kernel of the $\mathfrak{g} = \mathfrak{se}(2)$-valued one-form



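$$\mathcal{A} = \begin{pmatrix}
dx + y\, d\theta + \frac{R}{2}\cos\theta\,(d\psi^1 + d\psi^2) + y\frac{R}{2c}(d\psi^1 - d\psi^2) \\
dy - x\, d\theta + \frac{R}{2}\sin\theta\,(d\psi^1 + d\psi^2) - x\frac{R}{2c}(d\psi^1 - d\psi^2) \\
d\theta + \frac{R}{2c}(d\psi^1 - d\psi^2)
\end{pmatrix}.$$

Thus the constraint distribution is $\mathcal{D} = \mathcal{A}^{-1}(0) = \mathrm{span}\{\xi_1, \xi_2\}$ where

$$\xi_1 := \partial_{\psi^1} - \frac{R}{2}\Big(\cos\theta\, \partial_x + \sin\theta\, \partial_y + \frac{1}{c}\partial_\theta\Big) \quad\text{and}\quad \xi_2 := \partial_{\psi^2} - \frac{R}{2}\Big(\cos\theta\, \partial_x + \sin\theta\, \partial_y - \frac{1}{c}\partial_\theta\Big). \qquad (5.1)$$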

Symmetry reduction 

Since $\mathcal{A}$ is a connection one-form for the principal bundle $\pi : Q \to M = Q/G$, the
system $(Q, \mathcal{D}, \mathcal{L})$ is of $G$-Chaplygin type. Let $J : TQ \to \mathfrak{g}^*$ be the momentum map of
the $G$-action. Then a calculation shows that

{J{q,v'^^ + v%),Cvirv^{^^^,^^2)) = § {v' - v') . 

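Note that this vanishes if $l = 0$. Let us apply the Gram-Schmidt orthonormalization
scheme with respect to the reduced metric $\mu_0$ to $\partial_{\psi^1}, \partial_{\psi^2}$ and denote the result by $u_1, u_2$.
Thus,

$$u_1 = \Big(J_w + m\frac{R^2}{4} + J_0\frac{R^2}{4c^2}\Big)^{-1/2} \partial_{\psi^1} \qquad (5.2)$$

and

$$u_2 = \Big(\mu_0(\partial_{\psi^2}, \partial_{\psi^2}) - \frac{\mu_0(\partial_{\psi^1}, \partial_{\psi^2})^2}{\mu_0(\partial_{\psi^1}, \partial_{\psi^1})}\Big)^{-1/2} \Big(\partial_{\psi^2} - \frac{\mu_0(\partial_{\psi^1}, \partial_{\psi^2})}{\mu_0(\partial_{\psi^1}, \partial_{\psi^1})}\, \partial_{\psi^1}\Big). \qquad (5.2a)$$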
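The Gram-Schmidt step can be checked numerically. The sketch below is an illustration only: it assumes that the reduced metric is obtained by evaluating $\mu$ on the constraint vector fields, $\mu_0(\partial_{\psi^i}, \partial_{\psi^j}) = \mu(\xi_i, \xi_j)$, with $\xi_1, \xi_2$ read as $\partial_{\psi^{1,2}} - \frac{R}{2}(\cos\theta\,\partial_x + \sin\theta\,\partial_y \pm \frac1c\partial_\theta)$. All numerical parameter values except $R$ and $c$ are hypothetical, and the function names are ours.

```python
import math

# Hypothetical parameter values; only R and c are quoted in the text of Section 5.A.
J_w, J_0, m_0, m_w, l, R, c = 0.05, 0.4, 2.0, 0.3, 0.2, 0.3, 0.1
m = m_0 + 2.0 * m_w

def metric(theta):
    """Matrix of mu in the coordinate order (psi1, psi2, x, y, theta)."""
    g = [[0.0] * 5 for _ in range(5)]
    g[0][0] = g[1][1] = J_w
    g[2][2] = g[3][3] = m
    g[4][4] = J_0
    g[3][4] = g[4][3] = m_0 * l * math.cos(theta)
    g[2][4] = g[4][2] = -m_0 * l * math.sin(theta)
    return g

def xi(theta):
    """Components of the constraint vector fields xi_1, xi_2 (assumed reading of (5.1))."""
    base = [-0.5 * R * math.cos(theta), -0.5 * R * math.sin(theta)]
    return [1.0, 0.0] + base + [-0.5 * R / c], [0.0, 1.0] + base + [0.5 * R / c]

def inner(g, v, w):
    return sum(v[i] * g[i][j] * w[j] for i in range(5) for j in range(5))

def gram(theta):
    """Entries (g11, g12, g22) of the reduced metric in the basis (d/dpsi1, d/dpsi2)."""
    g = metric(theta)
    x1, x2 = xi(theta)
    return inner(g, x1, x1), inner(g, x1, x2), inner(g, x2, x2)

def gram_schmidt(g11, g12, g22):
    """Coefficients of u1, u2 with respect to (d/dpsi1, d/dpsi2)."""
    u1 = (1.0 / math.sqrt(g11), 0.0)
    norm2 = g22 - g12 * g12 / g11
    u2 = (-g12 / g11 / math.sqrt(norm2), 1.0 / math.sqrt(norm2))
    return u1, u2

def reduced_inner(gm, v, w):
    g11, g12, g22 = gm
    return v[0] * g11 * w[0] + g12 * (v[0] * w[1] + v[1] * w[0]) + v[1] * g22 * w[1]

grams = [gram(t) for t in (0.0, 0.7, 2.1)]   # should not depend on theta
u1, u2 = gram_schmidt(*grams[0])
print(grams[0], u1, u2)
```

The Gram matrix comes out independent of $\theta$, which is what allows the orthonormal frame to have constant coefficients on $M$.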

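Using the relation

$$\mu_0(b) = \beta = \mu_0\Big(\sum_{i,j} \langle J(u_i), \mathrm{Curv}^{\mathcal{A}}(u_j, u_i)\rangle\, u_j\Big),$$

and expanding everything in terms of $(\partial_{\psi^1}, \partial_{\psi^2})$, one finds that the drift vector $b$ equals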



Since $M$ is compact, this $b$ cannot be the gradient of a function for $l \neq 0$. Thus, we can
conclude that the deterministic two-wheeled robot does not have a preserved volume for
$l \neq 0$ and that the associated stochastic system is not time-reversible.

Kinematics of the noisy cart 

Formula (5.3) seems to imply that the stochastic cart (with zero initial velocity) acquires 
a tendency to go backwards when the center of mass is displaced towards the rear. To 
see that this is indeed the case we should check that the horizontally lifted mean curve 
coincides with the expected motion of the cart. 

Since TQ =^ T*Q (vector bundle isomorphism induced by the Riemannian metric /i 
on Q) and TTQ are trivial, we may view TQ C TTQ as a vector subbundle, so that = 
hl'^(6) and become vector fields on TTQ. Then the stochastic dynamics = {qt,Pt) 
on T) are generated by the operator 

a 

or by the Stratonovich equation 
Here was defined in Section 2. Now in local coordinates $(q^i, p_i)$ on $TQ$ the stochastic
equations of motion are 

6{pioTf) = dpaPXn{qt,Pt))St, 

5{q' o Tf ) = {pt)M + W {b\Qt)) 5t + dq' 

If the initial conditions are $(q_0, 0)$ then the solution is given by $(q_t, 0)$ where $q_t$ satisfies

6qt = lb\qt)6t + J2'''(^^^^^''- 

a 

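Let $q_t = (\psi^1_t, \psi^2_t, x_t, y_t, \theta_t) = (q^i_t)$. By [37, Lemma 7.3.2] we have

$$E[q^i_t] = q^i_0 + E\Big[\int_0^t \Big(\tfrac12 \mathrm{hl}^{\mathcal{A}}(b) + \tfrac12 \sum_a u_a u_a\Big) q^i\,(q_s)\, ds\Big].$$

Rewriting

$$\sum_a u_a u_a = A(\xi_1\xi_1 + \xi_2\xi_2) + B(\xi_1\xi_2 + \xi_2\xi_1),$$

with

$$A = \frac{\mu(\xi_1, \xi_1)}{\mu(\xi_1, \xi_1)^2 - \mu(\xi_1, \xi_2)^2}, \qquad B = -\frac{\mu(\xi_1, \xi_2)}{\mu(\xi_1, \xi_1)^2 - \mu(\xi_1, \xi_2)^2},$$

and noting that $\mu(\xi_1, \xi_2)$ and $\mu(\xi_1, \xi_1) = \mu(\xi_2, \xi_2)$ are constants, implies that $\sum_a u_a u_a q^i = 0$
for $(q^i) = (\psi^1, \psi^2, x, y, \theta)$. Thus

$$\frac{d}{dt} E[q_t] = \tfrac12\, \mathrm{hl}^{\mathcal{A}}(b)(E[q_t]),$$

i.e., $E[q_t]$ is the horizontal lift of the integral curve of $\tfrac12 b \in \mathfrak{X}(M)$.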

Therefore, constraints and noise couple to produce a backwards drift of the robot. 
We emphasize that this is a stochastic non-holonomic effect which does not appear in a 
Hamiltonian setting. Indeed, the Hamiltonian reduction of Brownian motion at the zero- 
momentum level yields Brownian motion and this is consistent with the fact that the 
reduced two-wheeled robot system is actually Hamiltonian when $l = 0$.

Trajectory planning for noisy wheels 

Generally speaking, consider a non-holonomic system (such as the cart) and assume that
it is controlled so as to follow a predefined smooth curve $c(t) \in Q$, $t \in [0, T]$, when no
noise is present. When the system is stochastically perturbed we may ask whether $c(t)$ is
also the expected motion of the perturbed system.

Suppose we want to steer the robot so that it follows a predefined curve in the plane.
As a curve we consider the circle $C$ of radius $\rho > 0$ centered at the origin. The initial
configuration of the robot should be $(x_0, y_0, \theta_0) = (\rho, 0, \pi/2)$ and the vehicle should
go around the circle in the positive sense. It is assumed that the wheel speeds can be
individually controlled.

In this section we consider the example of [42]. Here the wheels are subject to a
Gaussian white noise which is modeled by the Stratonovich equation

$$\delta\Gamma^M = \sqrt{D_1}\, \partial_{\psi^1} \delta W^1 + \sqrt{D_2}\, \partial_{\psi^2} \delta W^2 \qquad (5.4)$$

in $TM$ where $(W^i)$ is Brownian motion in $\mathbb{R}^2$ and $D_i > 0$ are constants. In this setup
one assumes that the controlled vehicle is not affected by the kinematics of the problem,
thus effectively forgetting the metric $\mu$. The generator of $\Gamma^M$ is $\frac12(D_1 \partial_{\psi^1}^2 + D_2 \partial_{\psi^2}^2)$.
Equation (5.4) lifts to a Stratonovich equation

= ^/d1^i6W^ + ^/d~2^2SW^ 



in TQ. This is in accordance with the general theory of [16, 17]; the generator of F'^ is 
= ^{Di^l + -02^1) which can be regarded as the horizontal lift of A*^. Consider the 
deterministic input vector field 



u 



on M where the control 



m 



2t, 

) t~T 



0<t < y^=:ti 
ti<t<T:=^ + ti 



is chosen such that the unperturbed robot traverses the nominal curve C exactly once and 
initial and final speed are 0. The equation for the controlled noisy robot is thus 

and the corresponding (time-dependent) generator is = + hl'^(M) whence by [37, 
Lemma 7.3.2] 



E[fm)] = f{T^,)-E 



(A« + hl-^(«))(/)(F^)d. 



(5.5) 



for $f \in C^\infty(Q)$. (The expectation is taken with respect to the underlying probability.) Let



(0,0 



yt, 



iD2-Di)R'^ 
8c 
Using (5.5) we find

E[xt] = K[ E[cos{es)]ds + p [ X{s)E[sm{es)]ds 
Jo Jo 

= K [ e"*cos(^(s))cis + p / X{s)e''' sm{e{s)) ds 
Jo Jo 

E[yt\ = -K E[sm{es)]ds + p X{s) E[cos{e s)] ds 
Jo Jo 

= -k[ e^' sm{9{s)) ds + p [ A(s)e'^* cos(^(s)) c/s 



where 9{t) differs from 9t and is defined by 

9{t) = 



t^ + f, 0<t<^/f=:ti; 



This determines the orientation of the vehicle. 
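The passage from $E[\cos(\theta_s)]$ and $E[\sin(\theta_s)]$ to an exponential factor times $\cos(\theta(s))$ and $\sin(\theta(s))$ is of the same type as a standard Gaussian identity: if $\Theta = m + sZ$ with $Z$ standard normal, then $E[\cos\Theta] = e^{-s^2/2}\cos m$ and $E[\sin\Theta] = e^{-s^2/2}\sin m$. The sketch below checks this identity by quadrature. It assumes, as our reading of the formulas above, that the heading angle is Gaussian around the nominal angle; the values of `mean` and `std` and the function name are hypothetical.

```python
import math

def gaussian_expectation(f, mean, std, n=20000, width=10.0):
    """Approximates E[f(mean + std * Z)] for Z ~ N(0, 1) by the trapezoidal
    rule on [-width, width]."""
    h = 2.0 * width / n
    total = 0.0
    for i in range(n + 1):
        z = -width + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * f(mean + std * z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)

mean, std = 0.9, 0.6   # hypothetical nominal angle and standard deviation
lhs_cos = gaussian_expectation(math.cos, mean, std)
rhs_cos = math.exp(-0.5 * std**2) * math.cos(mean)
lhs_sin = gaussian_expectation(math.sin, mean, std)
rhs_sin = math.exp(-0.5 * std**2) * math.sin(mean)
print(lhs_cos, rhs_cos, lhs_sin, rhs_sin)
```

The damping factor shrinks the mean position towards the interior of the nominal curve, which is one way to understand why the mean trajectory differs from the deterministic one.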

We have solved for $(E[x_t], E[y_t])$ using Maxima and its built in Runge-Kutta scheme.
Here is a plot:

[Figure: mean and deterministic trajectories of the robot in the plane; horizontal axis $x$.]

The data are $\rho = 1$, $D_1 = 1.2$, $D_2 = 0.8$, $R = 0.3$, $c = 0.1$. We have
plotted the accelerating and braking parts of $(E[x_t], E[y_t])$ as blue and
red, and the accelerating and braking parts of the unperturbed controlled
robot $(x(t), y(t))$ as green and magenta, respectively.

The discrepancy between the deterministic trajectory and the mean curve of the perturbed 
system is quite obvious. This phenomenon has also been observed in [42] by means of 
numerical simulations, and [42] have also proposed a trajectory planning algorithm which 
takes the perturbation into account. When comparing the above picture to that of [42], it 
should be noted that we have chosen a different convention for the orientation of the 
wheels. 




[Figure legend: mean, acc.; mean, braking; det., acc.; det., braking.]
5.B Microscopic snakeboard under molecular bombardment

This is not a G-Chaplygin system but does fit the set-up of Section 4.A. 

In describing the snakeboard we follow mostly the presentation of [9]. There is, however,
one difference: the metric which we use to define the kinetic energy is that of [7].
This considerably simplifies some of the formulas. We further assume that the angle of the
front axis equals minus that of the back axis. Thus the configuration space of this system
is

$$Q = S^1 \times S^1 \times SE(2) = \{q = (\phi, \psi, x, y, \theta)\}.$$

The constraint distribution is the kernel of the $\mathbb{R}^2$-valued one-form $\omega = (\omega_1, \omega_2)$ given by

$$\begin{aligned}
\omega_1(q) &= -\sin(\theta + \phi)\, dx + \cos(\theta + \phi)\, dy - r\cos(\phi)\, d\theta, \\
\omega_2(q) &= -\sin(\theta - \phi)\, dx + \cos(\theta - \phi)\, dy + r\cos(\phi)\, d\theta
\end{aligned}$$

where $2r$ is the distance between the axes measured from their respective midpoints. Thus

$$\mathcal{D} = \mathrm{span}\{\partial_\phi, \partial_\psi, s := a\partial_x + b\partial_y + c\partial_\theta\}$$

where the functions $a, b, c$ are given by

$$\begin{aligned}
a &= -r\big(\cos(\phi)\cos(\theta - \phi) + \cos(\phi)\cos(\theta + \phi)\big), \\
b &= -r\big(\cos(\phi)\sin(\theta - \phi) + \cos(\phi)\sin(\theta + \phi)\big), \\
c &= \sin(2\phi).
\end{aligned}$$
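That $s$ lies in the kernel of both constraint forms can be verified numerically. The sketch below is an illustration under one explicit assumption: $\omega_2$ is taken with the term $+r\cos(\phi)\,d\theta$, which is the sign for which the stated $a, b, c$ annihilate both forms. The value of `r`, the sample points, and the function names are hypothetical.

```python
import math

def omega(phi, theta, r):
    """Coefficients (dx, dy, dtheta) of the two constraint one-forms.
    ASSUMPTION: omega_2 carries +r*cos(phi) in front of dtheta."""
    w1 = (-math.sin(theta + phi), math.cos(theta + phi), -r * math.cos(phi))
    w2 = (-math.sin(theta - phi), math.cos(theta - phi), r * math.cos(phi))
    return w1, w2

def s_field(phi, theta, r):
    """Components (a, b, c) of s = a d/dx + b d/dy + c d/dtheta."""
    a = -r * (math.cos(phi) * math.cos(theta - phi) + math.cos(phi) * math.cos(theta + phi))
    b = -r * (math.cos(phi) * math.sin(theta - phi) + math.cos(phi) * math.sin(theta + phi))
    c = math.sin(2.0 * phi)
    return a, b, c

def pairing(form, vec):
    return sum(f * v for f, v in zip(form, vec))

r = 0.5                                            # hypothetical half-distance
samples = [(0.3, 1.1), (-0.8, 2.5), (1.2, -0.4)]   # (phi, theta) pairs
residuals = [abs(pairing(w, s_field(p, t, r)))
             for p, t in samples for w in omega(p, t, r)]
print(max(residuals))
```

The vector fields $\partial_\phi$ and $\partial_\psi$ are annihilated trivially, since neither form has a $d\phi$ or $d\psi$ component.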

Let $m$ be the mass of the board, $J_0$ its moment of inertia, and $J_\phi, J_\psi, J_\theta$ the moments
of inertia corresponding to rotation about the angle $\phi$, $\psi$, and $\theta$ respectively. Then the
Lagrangian of the system is the kinetic energy of the metric

$$\mu = m(dx \otimes dx + dy \otimes dy) + K\, d\theta \otimes d\theta + J_\phi\, d\phi \otimes d\phi + J_\psi\, d\psi \otimes d\psi + J_\psi(d\psi \otimes d\theta + d\theta \otimes d\psi)$$

where K := Jq + + J^. 

Let us assume that the snakeboard is perturbed by white noise. Using the left trivial- 
ization of TQ this can be modeled by a Stratonovich operator of the form 

S -.Qx TR^ — > TQ, 

{q,w,w') I — V a^{e„w')ui5W' 

where $(u_i)$ is a left invariant orthonormal frame on $Q$ and $\sigma > 0$ is a parameter specifying
the field strength. According to the results of Section 4, constrained Brownian motion is
a diffusion $\Gamma^{nh}$ with generator
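Here $(u_a)$ is an orthonormal frame of $\mathcal{D}$ and $\Pi : TQ = \mathcal{D} \oplus \mathcal{D}^\perp \to \mathcal{D}$. We fix this frame
to be

$$u_1 = J_\phi^{-1/2}\, \partial_\phi, \quad u_2 = \eta^{-1/2}\Big(\partial_\psi - J_\psi \frac{c}{\varepsilon}\, s\Big), \quad u_3 = \varepsilon^{-1/2}\, s$$

where

$$\varepsilon = m(a^2 + b^2) + Kc^2, \quad \eta = J_\psi\Big(1 - J_\psi \frac{c^2}{\varepsilon}\Big).$$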
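Orthonormality of this frame can be confirmed numerically. The sketch below is an illustration only: it reads the frame as $u_1 = J_\phi^{-1/2}\partial_\phi$, $u_2 = \eta^{-1/2}(\partial_\psi - J_\psi\frac{c}{\varepsilon}s)$, $u_3 = \varepsilon^{-1/2}s$ and takes the metric with cross term $J_\psi(d\psi \otimes d\theta + d\theta \otimes d\psi)$. All parameter values are hypothetical, and $K$ is treated as a free parameter with $K > J_\psi$.

```python
import math

# Hypothetical parameter values (K is treated as a free parameter with K > J_psi).
m, K, J_phi, J_psi, r = 1.5, 0.9, 0.2, 0.3, 0.5

def metric():
    """Matrix of mu in the coordinate order (phi, psi, x, y, theta)."""
    g = [[0.0] * 5 for _ in range(5)]
    g[0][0], g[1][1] = J_phi, J_psi
    g[2][2] = g[3][3] = m
    g[4][4] = K
    g[1][4] = g[4][1] = J_psi
    return g

def frame(phi, theta):
    """Components of u1, u2, u3 at a point with eps != 0."""
    a = -r * math.cos(phi) * (math.cos(theta - phi) + math.cos(theta + phi))
    b = -r * math.cos(phi) * (math.sin(theta - phi) + math.sin(theta + phi))
    c = math.sin(2.0 * phi)
    eps = m * (a * a + b * b) + K * c * c
    eta = J_psi * (1.0 - J_psi * c * c / eps)
    k = J_psi * c / eps
    u1 = [J_phi ** -0.5, 0.0, 0.0, 0.0, 0.0]
    u2 = [x / math.sqrt(eta) for x in (0.0, 1.0, -k * a, -k * b, -k * c)]
    u3 = [x / math.sqrt(eps) for x in (0.0, 0.0, a, b, c)]
    return u1, u2, u3

def inner(g, v, w):
    return sum(v[i] * g[i][j] * w[j] for i in range(5) for j in range(5))

g = metric()
deviation = 0.0
for phi, theta in [(0.3, 1.1), (-0.8, 2.5), (1.2, -0.4)]:
    us = frame(phi, theta)
    for i in range(3):
        for j in range(3):
            target = 1.0 if i == j else 0.0
            deviation = max(deviation, abs(inner(g, us[i], us[j]) - target))
print(deviation)
```

The sample points avoid $\cos\phi = 0$, where $\varepsilon$ vanishes and the frame degenerates.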

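Note that $\eta$ and $\varepsilon$ are functions of $\phi$ only. A calculation now shows that we have, for the
(trivial) connection $\nabla$ associated to $\mu$,

$$\begin{aligned}
\nabla_{u_1} u_1 &= 0, \\
\nabla_{u_2} u_2 &= \frac{J_\psi^2 c^3}{\varepsilon^2 \eta}\big((\partial_\theta a)\partial_x + (\partial_\theta b)\partial_y\big) \in \mathcal{D}^\perp, \\
\nabla_{u_3} u_3 &= \frac{c}{\varepsilon}\big((\partial_\theta a)\partial_x + (\partial_\theta b)\partial_y\big) \in \mathcal{D}^\perp.
\end{aligned}$$

Thus $\Pi\nabla_{u_a} u_a = 0$ for this frame and $\Gamma^{nh}$ is given by the Stratonovich equation

$$\delta\Gamma^{nh} = \sigma\, u_a(\Gamma^{nh})\, \delta W^a. \qquad (5.6)$$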

As in the theory of [9], we fix the horizontal space of the principal bundle $\pi : Q \to
Q/G = T^2 = M$ associated to the distribution $\mathcal{D}$ to be given by the span of $\{u_1, u_2\}$. The
corresponding connection form is denoted by $\mathcal{A}$. Consider the control vector fields

U^it) = u^{t)d^, u^it) = sin(w^t) 

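in the control space $TM$. Their horizontal lifts are $\mathrm{hl}^{\mathcal{A}}(U^\phi) = u^\phi(t)\partial_\phi$ and $\mathrm{hl}^{\mathcal{A}}(U^\psi) =
u^\psi(t)\big(\partial_\psi - J_\psi \frac{c}{\varepsilon}\, s\big)$. Combining this with (5.6) yields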

(5r" = hlV(t/0 + U^)5t + aUa{T'')5W' (5.7) 

which describes the stochastic perturbation of the controlled snakeboard with deterministic
gait input $(\dot\phi, \dot\psi) = (u^\phi(t), u^\psi(t))$.

Since the variables (0, ip) are also the ones which can be controlled, we are interested 
in estimating given that the projected process Xt = n o satisfies the projected 
equation 

6X = (U^t) + U4t))6t + J^hw'd^ + 7]{X)-"26W^d^ . (5.8) 

This is the filtering problem _E'[r"|7r o F" = Xt] =: Zt and the solution is provided by 
[16, 17]: The process F" can be decomposed as 
where $X^h$ is the horizontal lift of $X = \pi \circ \Gamma$ and $g$ is the reconstruction process. These
satisfy the Stratonovich equations 

= hl-^xAU^ + U^)St + a{ui{X'')6W^ + U2{X^)6W^), 



x^ = rj; = go e g 



9^ 



eeG. 



and 

5gf = aT,LgX.Ax.u,{X^)5W\ 
(See also Section 3.C.) By [17] we have that 

Z, = E[gf]-Xl (5.9) 

Let X/^ = (04, ipt, Xt-, Vt, Ot) and E[g^] = {at, bt, 7*) G G. It follows from Proposition 3.3 
that the mean reconstruction curve E[g^] is determined by the time- and a; G (^-dependent 
ODE 



at 



E[gl 




(5.10) 



(a(0t, dt) + ytc{4>t)) sin(7t) - (6(0t, - Xtc{4>t)) cos(7t 
I (a(0i, + yc(0i)) cos(7t) - - a;tc(0i)) sin(7i) 







Using the rule for transforming Stratonovich equations to Ito type, we can characterize 
X^ by the Ito equation 



dX' 



\ 



_ T a(<l>t,0t)c((l>t) I I 2 



{dea){(l)t,0t 



c{<f>tf 



+ aui{X^)dW^ + au2{X^)dW'^ . (5.11) 
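For reference, the conversion rule invoked here is the standard one relating Stratonovich and Ito equations in local coordinates. In the display below, $V_0$ and $V_a$ are generic placeholder names for the drift and noise vector fields (they are not notation from the text), and $DV_a$ denotes the Jacobian of $V_a$.

```latex
\delta X = V_0(X)\,\delta t + \sum_a V_a(X)\,\delta W^a
\quad\Longleftrightarrow\quad
dX = \Big( V_0(X) + \tfrac12 \sum_a (DV_a)(X)\,V_a(X) \Big)\,dt + \sum_a V_a(X)\,dW^a ,
\qquad
\big((DV_a)\,V_a\big)^i = V_a^j\,\partial_j V_a^i .
```

The additional drift term is where derivatives such as $\partial_\theta a$ enter the Ito form of the equation.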



Equation (5.11) involves an iterated dependence on trigonometric functions, and hence
numerical simulation is not straightforward. A naive approach would be to run an
Euler-Maruyama and an Euler simulation for (5.11) and (5.10) respectively, and to
multiply the results together according to (5.9) which is the action of $G$ on $Q$. This yields
$Z_t$. Running the simulation sufficiently many times and computing the average yields the
mean $E[Z_t]$. We have implemented this scheme and the results seem reasonably stable up
to time 1, according to a first order test. Beyond that time, the trajectories blow up very
quickly, which is a strong indication that the method is unstable and a more detailed
analysis of the numerical implementation is necessary. Our preliminary results are contained
in the plot below.
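The two ingredients of the naive approach, an Euler-Maruyama step for an Ito equation and a Monte Carlo average over independent runs, can be sketched generically as follows. This is not the implementation used for the plot: it is a minimal scalar sketch applied to a toy linear equation $dX = -X\,dt + \sigma\,dW$ (for which $E[X_T] = x_0 e^{-T}$), and the function names, step counts, and seed are hypothetical.

```python
import math
import random

def euler_maruyama(drift, diffusion, x0, T, n_steps, rng):
    """Endpoint of one Euler-Maruyama sample path of the scalar Ito SDE
    dX = drift(t, X) dt + diffusion(t, X) dW on [0, T]."""
    h = T / n_steps
    x, t = x0, 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(h))   # Brownian increment over one step
        x = x + drift(t, x) * h + diffusion(t, x) * dw
        t += h
    return x

def monte_carlo_mean(drift, diffusion, x0, T, n_steps, n_paths, seed=0):
    """Average of the endpoints of n_paths independent simulated paths."""
    rng = random.Random(seed)
    total = sum(euler_maruyama(drift, diffusion, x0, T, n_steps, rng)
                for _ in range(n_paths))
    return total / n_paths

# Toy test problem (NOT the snakeboard equations).
sigma = 0.3
mean_noisy = monte_carlo_mean(lambda t, x: -x, lambda t, x: sigma,
                              1.0, 1.0, 200, 2000)
mean_noise_free = monte_carlo_mean(lambda t, x: -x, lambda t, x: 0.0,
                                   1.0, 1.0, 200, 1)
print(mean_noisy, mean_noise_free, math.exp(-1.0))
```

With the noise switched off the scheme reduces to the explicit Euler method, which gives a simple consistency check before averaging over many noisy runs.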
[Figure: center of mass trajectories of the snakeboard.]

The blue line is the center of mass motion of the unperturbed snakeboard
and the dotted magenta line is the mean motion of the
stochastic snakeboard with the same deterministic input. Additionally 3
sample plots have been included. The data are as indicated above: T is 
the runtime, the step size, M the number of experiments, rad = ^ 
and $\sigma$ the parameter specifying the strength of the white noise. The initial
conditions are $q_0 = (0, 0, 0, 0, 0.5)$.